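A process and system for collecting data about positions
of roads located in a geographic area and using the
collected data to develop representations of the roads for a
geographic database. Data representing positions along
roads are acquired using equipment installed in a
vehicle which is driven on the roads. The data acquired
while driving may be smoothed and fused. The data
acquired while driving are processed by a program that
automatically determines new coordinates to adjust the
represented positions to align with the centerlines of the
represented roads.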
Description

REFERENCE TO RELATED APPLICATION 

[0001] The present application is related to the
copending application entitled "METHOD AND SYSTEM
FOR AUTOMATIC GENERATION OF SHAPE AND CURVATURE
DATA FOR A GEOGRAPHIC DATABASE" filed on even date
herewith, Attorney Docket No. N0024US, the entire
disclosure of which is incorporated by reference herein.

BACKGROUND OF THE INVENTION 

[0002] The present invention relates to a process
and system for collecting data about roads located in a 
geographic area and using the collected data to develop 
representations of the positions and shapes of the 
roads for a geographic database. 

[0003] Geographic databases have various uses.
Geographic databases are used in in-vehicle navigation 
systems, personal computers, networked computing 
environments, and various other kinds of platforms, as 
well as Internet applications. Geographic databases are 
used with various kinds of applications to provide many 
different functions, including map display, route calcula- 
tion, route guidance, truck fleet deployment, traffic con- 
trol, electronic yellow pages, emergency services, and 
so on. Geographic databases are also used with vari- 
ous types of drivers' assistance features such as obsta- 
cle warning and avoidance, curve warning, advanced 
cruise control, headlight aiming, and so on. 
[0004] In order to provide these kinds of functions, a 
geographic database includes data that represent phys- 
ical features in a covered geographic region. Physical 
features that are represented by geographic databases 
include roads, points of interest, railroad tracks, bodies
of water, intersections, and so on. With respect to navigable
roads, geographic databases may include data
about the various characteristics of the represented 
roads, such as the geographic coordinates of roads, 
speed limits along road segments, locations of stop 
lights, turn restrictions at intersections of roads, address 
ranges, street names, and so on. Geographic data- 
bases may also include information about points of 
interest in covered regions. Points of interest may 
include restaurants, hotels, airports, gas stations, stadi- 
ums, police stations, and so on. 
[0005] Collecting information for a geographic data- 
base is a significant task. Not only is the initial collection 
of information a significant undertaking, but a geo- 
graphic database needs to be updated on a regular 
basis. For example, new streets are constructed, street 
names change, traffic lights are installed, and turn 
restrictions are added to existing roads. Also, new levels 
of detail may be added about geographic features that 
are already represented in an existing geographic database.
For example, existing data about roads in a geographic
database may be enhanced with information
about lane widths, shoulder sizes, lane barriers,
address ranges, sidewalks, bicycle paths, etc. Thus,
there exists a need to continue to collect information for 
a geographic database. 

[0006] Included among the most important types of 
data in a geographic database are the positions and 
geometry (i.e., shapes) of roads. Using a GPS system, 
a person can determine his/her geographic coordinates 
on the surface of the earth. However, in order for the 
person to know what road he/she is on, it is required to 
know the geographic coordinates of the roads around 
the person in order to relate the person's geographic 
coordinates to the geographic coordinates of the roads. 
[0007] How a geographic database represents the 
positions and geometry of roads is an important consid- 
eration that can affect the usefulness of the geographic 
database. The manner in which roads are represented 
in a geographic database can affect the kinds of appli- 
cations that can use the data in the geographic data- 
base. 

[0008] Geographic databases represent positions 
of roads by identifying the geographic coordinates of 
points along the roads. According to a prior method, a 
geographic database developer-technician performed 
the step of selecting points along a road to be used to 
represent the road in a geographic database. The geo- 
graphic database developer-technician viewed an 
image of the road and, while viewing the image, esti- 
mated the locations of points from the image to use to 
represent the road. 

[0009] The image of the road that was viewed by 
the database developer-technician could be obtained by 
various means. One way to obtain an image of the road 
was to use aerial photographs of the roads. Another 
way to obtain an image of the road was to view a trace of
GPS data acquired while driving along the road. Still 
another way to obtain an image of the road was to use 
ground-based photographs. Regardless of the means 
by which the image of the road was obtained, the geo- 
graphic database developer-technician selected points 
from the image of the road and the geographic coordi- 
nates of these points were used to represent the road in 
the geographic database. For straight road segments, 
the database developer-technician identified the geo- 
graphic coordinates of the intersections at each end of 
the straight road segment. For a curved road segment, 
the database developer-technician selected one or 
more points along the curved portion of the road seg- 
ment to approximate the location of the road. 
[0010] Although this process worked well, there is 
room for improvement. Aerial photographs, as well as
other images from which points along roads could be 
selected, provide only a limited amount of detail. In 
addition, aerial photographs and other kinds of images 
of roads are useful for acquiring only certain kinds of data
about geographic features. For example, aerial photographs
of roads are not useful for identifying the locations
of road signs or address ranges along the roads.
Thus, if an aerial photograph is used to determine the 
geographic coordinates of locations on roads, it is still 
necessary for a geographic database developer-techni- 
cian to physically travel along the road segments shown 
on the aerial photograph to acquire data about the fea- 
tures that cannot be discerned from the aerial photo- 
graph. This increases the cost of acquiring information 
for a geographic database. Similar limitations are asso- 
ciated with images obtained by other means. 
[0011] Another area in which there is room for 
improvement relates to consistency. When an image of
a road is used to determine points along the road to rep- 
resent the road, the selection of points depends to some 
extent upon the judgment of the geographic database 
developer-technician. Therefore, the points chosen to 
represent a road may not be consistent between different
geographic database developer-technicians. This is
especially the case for curved portions of roads. 
[0012] Accordingly, there exists a need for an 
improved process to collect data about the locations of 
physical features for a geographic database. In addition, 
there exists a need for an improved process and/or sys- 
tem to collect data about positions and shapes of roads 
and use the collected data to represent the roads in a 
geographic database. 

SUMMARY OF THE INVENTION 

[0013] To address these and other objectives, the 
present invention comprises a process and system for 
collecting data about positions of roads located in a 
geographic area and using the collected data to develop 
representations of the roads for a geographic database. 
Data representing positions along roads are acquired 
using equipment installed in a vehicle which is driven on 
the roads. The data acquired while driving may be 
smoothed and fused. The data acquired while driving 
are processed by a program that automatically deter- 
mines new coordinates to adjust the represented posi- 
tions to align with the centerlines of the represented 
roads. Data including the new coordinates are stored in 
the geographic database. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0014] 

Figure 1 is a diagram illustrating a coverage area in
which an embodiment of the present invention for
collecting data for a geographic database can be
used.

Figure 2 is a diagram illustrating a process for forming
derived database products from the primary
version of the geographic database shown in Figure
1.

Figure 3 is a map illustrating an assignment area
which is located in the coverage area shown in Figure 1
and which contains geographic features about
which data will be collected for the primary version
of the geographic database.

Figure 4 is an illustration of a portion of a road in the
geographic area shown in Figure 3.

Figure 5 is a block diagram showing components of
a data record in the geographic database used to
represent the road shown in Figure 4.

Figure 6 is an illustration of another portion of a
road in the geographic area shown in Figure 3.

Figure 7 is a flow diagram of a process according to
a first embodiment for forming data that represents
roads for the geographic database of Figure 2.

Figure 8 is a block diagram of the components of
the equipment installed in the vehicle used in the
data collection step shown in Figure 7.

Figure 9 is an illustration of a road upon which a
vehicle is being driven for collecting road shape
data according to an embodiment of the process
shown in Figure 7.

Figure 10 is an enlargement of a portion of the
illustration of Figure 9 and shows raw data points.

Figure 11 shows the same portion of the road as
shown in Figure 10 and shows fused data points
derived from the raw data points of Figure 10.

Figure 12 shows the same portion of the road as
shown in Figures 10 and 11 and shows smoothed
data points derived from the fused data points in
Figure 11.

Figure 13 shows the same portion of the road as
shown in Figures 10-12 and shows an optional step
of removing outliers.

Figure 14 shows the same portion of the road as
shown in Figure 13 and shows smoothed data
points after removal of the outliers in Figure 13.

Figure 15 is a flow diagram of the steps in a portion
of the process shown in Figure 7 for automatically
selecting which of the collected data points are to be
used to form shape points for the geographic database.

Figures 16A-16E show application of the process of
Figure 15 to automatically generate shape points
for a geographic database.

Figures 17A-17E illustrate an alternative process
for automatically generating shape points for a
geographic database.

Figures 18A-18E show application of another
alternative process for automatically generating shape
points for a geographic database.

Figure 19 is a flow diagram of the steps in a portion
of the process shown in Figure 7 for automatically
adjusting the selected shape points to account
for the vehicle location while collecting data.

Figures 20A-20D show application of the process of
Figure 19 to automatically centerline shape points
for a geographic database.

Figures 21A-21D show an alternative process for
automatically centerlining shape points for a
geographic database.

DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EMBODIMENTS

I. OVERVIEW 

[0015] A first embodiment is described with reference
to Figures 1 through 20D. Figure 1 shows a primary
copy 100 of a geographic database. The primary
copy 100 of the geographic database includes data 102
that represent geographic features in a coverage area
108. The coverage area 108 may correspond to an
entire country, such as the United States. Alternatively,
the primary copy 100 of the geographic database may
correspond to several countries, such as the United
States, Canada, and Mexico, or France, Germany, and
Italy, and so on. According to another alternative, the
primary copy 100 may represent only a single region
within a country, such as the West Coast or the Midwest
of the United States. The primary copy 100 of the
geographic database is maintained as the copy or version
that has the most up-to-date data relating to the
coverage area 108. Although the primary copy 100 of the
geographic database includes data that represent
geographic features in the entire coverage area 108, there
may be parts of the coverage area 108 that contain
geographic features that are not represented by data in the
geographic database, or for which the coverage is
sparse.

[0016] As stated above, the data 102 in the primary
copy 100 of the geographic database represent
geographic features in the covered area 108. The data 102
include various attributes of the represented
geographic features. For example, included in the primary
copy 100 of the geographic database are data that
represent roads and data that represent attributes of roads,
such as the geographic coordinates of positions on the
roads, the curvature at points along the roads, the street
names of the roads, the address ranges along the
roads, turn restrictions at intersections of roads, and so
on. The geographic data 102 may also include
information about points of interest in the covered area 108.
Points of interest may include hotels, restaurants,
museums, stadiums, offices, automobile dealerships, auto
repair shops, etc. The geographic data 102 may include
data about the locations of these points of interest. The
geographic data 102 may also include information about
places, such as cities, towns, or other communities. The
geographic data 102 may include other kinds of
information.

[0017] The primary copy 100 of the geographic
database is updated, expanded, and/or otherwise
modified on a regular and continuing basis. The primary
copy 100 of the geographic database is physically
located at a first location 114. In one embodiment, the
primary copy 100 of the geographic database is stored
on one or more hard drives and accessed with a
mainframe computer 116, such as an Amdahl or IBM
mainframe computer. One or more backup copies are also
maintained.

[0018] In one embodiment, the geographic data 
102 are maintained and developed by Navigation Tech- 
nologies Corporation of Rosemont, Illinois. However, it 
is understood that the inventive concepts disclosed 
herein are not restricted to any particular source of data. 
[0019] As illustrated in Figure 2, the primary copy
100 of the geographic database is used to make derived
database products 110. The derived database products
110 made from the primary copy 100 may include only
portions of all the data in the primary copy 100. For
example, the derived database products 110 may
include data that relate to only one or more specific
regions located within the coverage area 108 of the
primary copy 100.

[0020] The derived database products 110 are 
used by various applications. For example, the derived 
database products 110 may be used for navigation- 
related applications, such as route calculation, route 
guidance, vehicle positioning, and map display. The
derived database products 110 may also be used by
applications that provide vehicle safety or control
functions, such as obstacle avoidance, automatic cruise
control, accident avoidance, automatic curve detection,
automatic headlight aiming, and so on. The derived
database products 110 may also be used for other kinds
of functions, such as electronic yellow pages, etc.
[0021] The derived database products 110 may be 
used on various kinds of computing platforms 112. For 
example, the derived database products 110 may be 
used in navigation systems (such as in-vehicle naviga- 
tion systems and hand-held portable navigation sys- 
tems), personal computers (including desktop and 
notebook computers), and other kinds of devices (such 
as PalmPilot®-type devices, pagers, telephones, per- 
sonal digital assistants, and so on). Derived database 
products 110 may also be used on networked comput- 
ing platforms and environments, including the Internet. 
[0022] The derived database products 110 made 
from the primary copy 100 may be in a format which is 
different from the format in which the primary copy 100 
of the database is maintained. The derived database 
products 110 may be in a format that facilitates the uses
of the derived products on the platforms in which they 
are installed. The derived database products 110 may 
also be stored in a compressed format on the media on 
which they are located. 

[0023] The derived database products 110 may be
stored on media that are suitable for the hardware
platforms in which they are installed. For example, the
derived database products may be stored on CD-ROM
disks, hard drives, DVD disks, flash memory, or other
types of media that are available now or that become 
available in the future. 

[0024] As mentioned previously, the primary copy
100 of the geographic database includes the most
up-to-date data relating to the coverage area 108.
Processes are used to update, check, and expand the
coverage of the data 102 in the primary copy 100 of the
geographic database on a regular basis. Expanding the
coverage of the database includes adding data records
to represent geographic features that had not already
been represented by records in the geographic database.
For example, within a coverage area (such as the
area 108 in Figure 1), there may be sub-areas that are
not represented. Expanding the coverage of the database
also includes adding data for new developments,
e.g., new subdivisions. Expanding the coverage may
also include adding more detail for areas or features
that are already represented. In addition to expanding
the coverage of the geographic database, there is a
continuous need to update and check the existing data
in the database. For example, speed limits may change,
turn restrictions may change, etc.

[0025] Referring again to Figure 1, the processes of
updating, checking and expanding the database are
performed by staff at one or more field offices 118. The
field offices 118 are located in the geographic area
corresponding to the coverage area 108 of the primary
copy 100 of the geographic database. Each field office
118 may be associated with a separate portion 120 of
the entire coverage area 108. Each field office 118
includes the appropriate computing equipment,
including hardware and software, so that data can be
exchanged between the field office and the main
computer 116. For example, each field office 118 may
include one or more workstation computers 121 upon
which are installed various programs 122. Included
among these programs 122 are programs for processing
and manipulating raw data collected by researchers
while out in the field, programs for communicating with
the main computer 116 in order to access the primary
copy of the geographic database, and programs for adding
or modifying data in the primary copy of the
geographic database as part of an updating process. In one
embodiment, the field offices 118 and the main computer
116 are connected with a data network 124. The
network 124 may be a wide area network (WAN), the
Internet, or any other kind of technology that enables
the exchange of data between the main computer 116
and the field offices 118.

[0026] Each of the field offices 118 is staffed with
one or more technicians (referred to herein as
"researchers"). The researchers perform several
functions. The researchers collect data for the primary copy
100 of the geographic database. The researchers may
add data about geographic features that had not
previously been included in the primary copy 100 of the
geographic database. The researchers may also check
data about geographic features that are already
represented in the primary copy 100 of the database to
assure that the data are correct and up-to-date.
[0027] The data collection activities of a researcher
are organized into assignments. Referring to Figure 3,
each assignment is associated with an assignment area 
200. The assignment area 200 is a physical geographic 
area that contains geographic features about which the 
researcher collects data for updating or expanding the 
primary copy 100 of the geographic database. Included 
among the geographic features about which the 
researcher collects data is the road network. Figure 3 
illustrates a portion of the road network 206 in the cov- 
erage area 108. 

[0028] The assignment area 200 is typically a rela- 
tively small portion of the coverage area 108. The 
assignment area 200 may be within the part 120 of the 
coverage area assigned to the field office. The size of 
the assignment area 200 may depend upon various fac- 
tors, such as the kinds of data being collected, the dis- 
tance of the assignment area from the field office, the 
density of geographic features in the assignment area, 
and so on. For example, the assignment area 200 may 
be several square miles, or alternatively the assignment 
area 200 may be hundreds of square miles. 
[0029] Although data about some types of geo- 
graphic features can be collected without leaving the 
location of the field office (using aerial photographs, as 
mentioned above), collection of data for other types of 
geographic features may require that the researcher 
physically observe the geographic feature. Thus, a 
researcher may have to travel to the assignment area to 
collect some types of data. 



II. THE GEOGRAPHIC DATABASE

[0030] The geographic database 100 (in Figure 1) 
contains various kinds of information about roads and 
other features in the covered region. One important kind 
of information contained in the geographic database is 
data defining the locations of roads. Locations of roads 
may be represented in a geographic database in vari- 
ous different ways. One way to represent a location of a 
road is to include geographic coordinates of positions 
along the represented road. This type of representation 
is described in connection with Figures 4, 5 and 6. 
[0031] Figure 4 illustrates one road segment 210
which is part of the road network 206 shown in Figure 3.
The road segment 210 extends between an intersection
INT(1) and an intersection INT(2). In Figure 5, the
geographic database 100 includes a data record 222 that
represents the road segment 210. The data record 222
may include a record ID 222(1). Stored with the data
record 222 that represents the road segment 210 is
data 222(2) identifying the geographic coordinates
(e.g., latitude, longitude, and optionally altitude, grade,
curvature) of the left and right nodes located at the
endpoints of the road segment. With respect to the road
segment 210, the geographic coordinates of the left
node correspond to the geographic coordinates of the
intersection INT(1) and the geographic coordinates of
the right node correspond to the geographic coordinates
of the intersection INT(2). The geographic coordinates
that are stored for the node data 222(2) may be
expressed as absolute coordinates or relative coordinates.

[0032] For some applications, it is important to 
know the location(s) of a road segment between its end- 
points. If a road segment is straight (in two dimensions 
if the geographic database includes only latitude and 
longitude and in three dimensions if the geographic 
database includes latitude, longitude, and altitude), the 
locations of all points along the road segment can be 
determined by calculating a straight line between the 
geographic coordinates of the nodes at the endpoints of 
the road segment. However, if the road segment is 
other-than-straight (such as the road segment 210 in 
Figure 4), additional data are needed to determine the 
location of a point along the road segment. According to 
one embodiment, shape point data are used for this pur- 
pose. 
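
The straight-segment case described above can be illustrated with a
short sketch. It is not taken from the disclosure: the function name
and the `fraction` parameter are assumptions introduced here, and the
coordinates are treated as planar values, which is only a reasonable
approximation over short road segments.

```python
from typing import Tuple

# A coordinate is (latitude, longitude) or (latitude, longitude, altitude).
Coordinate = Tuple[float, ...]


def point_on_straight_segment(left: Coordinate,
                              right: Coordinate,
                              fraction: float) -> Coordinate:
    """Return the point lying `fraction` of the way from the left node
    to the right node of a straight road segment (hypothetical helper).

    Works in two or three dimensions, depending on how many components
    the node coordinates carry.
    """
    if len(left) != len(right):
        raise ValueError("both nodes must have the same number of dimensions")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    # Linear interpolation, applied component by component.
    return tuple(a + fraction * (b - a) for a, b in zip(left, right))
```

For a segment that is other than straight, this calculation would
only hold between consecutive stored points, which is why additional
data are needed along the segment.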

[0033] Referring to Figure 5, shape point data 
222(3) are stored in the data record 222 that represents 
the road segment 210. The shape point data 222(3) 
include one or more entries. Each entry in the shape
point data 222(3) contains data indicating the geo- 
graphic coordinates (e.g., latitude and longitude), and 
optionally additional data, such as altitude, curvature, 
and road grade, of a separate shape point along the 
road segment. (The geographic coordinates, altitude, 
curvature, and road grade stored for shape point data 
entry 222(3) may be absolute values or relative values.) 
A shape point is a location along a road segment 
between its endpoints. In Figure 4, shape points are 
shown located between the endpoints of the road seg- 
ment 210. For each of the shape points shown in Figure 
4, an entry is stored in the shape point data 222(3) 
stored in the record that represents the road segment 
210 in the geographic database 100.
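
As a rough illustration of the record layout described in connection
with Figure 5, the following sketch models a record ID, node data for
the two endpoints, and a list of shape point entries. The class and
field names are assumptions chosen for readability; the disclosure
does not specify a storage format or programming interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Position:
    """Coordinates stored for a node or a shape point (assumed layout)."""
    latitude: float
    longitude: float
    altitude: Optional[float] = None   # optional additional data
    curvature: Optional[float] = None
    grade: Optional[float] = None


@dataclass
class RoadSegmentRecord:
    """Sketch of a data record such as record 222."""
    record_id: int                      # corresponds to record ID 222(1)
    left_node: Position                 # node data 222(2), one endpoint
    right_node: Position                # node data 222(2), other endpoint
    shape_points: List[Position] = field(default_factory=list)  # 222(3)

    def polyline(self) -> List[Position]:
        """Return the endpoints and shape points in order along the segment."""
        return [self.left_node, *self.shape_points, self.right_node]
```

A straight segment would carry an empty `shape_points` list, so its
polyline reduces to the two nodes.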
[0034] In the embodiment described in connection 
with Figures 4 and 5, the data record 222 that repre- 
sents the road segment 210 includes shape point data 
identifying points located along a centerline of the rep- 
resented road segment. There are alternative ways in 
which the shape of a road segment may be repre- 
sented. The manner in which a road is represented may 
be related to the geometry of the road. For example, if 
the road is divided by a median, separate sets of shape 
point data and node data may be used to represent the 
separate groupings of lanes on each side of the median.
[0035] An example of a road divided by a median is
shown in Figure 6. As shown in Figure 6, a road segment
211 has lanes divided by a median. In Figure 6, a
separate set of shape points is associated with the
grouping of lanes on each side of the median. A database
record (similar to record 222 shown in Figure 5)
that represents the road segment in Figure 6 includes
separate sets of shape point data for the shape points
on each side of the median. If the lanes on each side of
a median are represented by separate sets of shape
point data, the shape points for each grouping of lanes
may be located along the centerline of the grouping of
lanes with which they are associated. In the example
shown in Figure 6, there are three lanes on each side of
the median. The shape points for the road segment 211
are located along the center of the middle lane on each
side of the median.

[0036] It can be appreciated that storing shape
point data can take a significant amount of data storage
capacity. Various means of data compression may be
used to minimize the amount of data that has
to be stored.
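
The disclosure does not name a particular compression method. One
simple possibility, consistent with the earlier remark that stored
values may be absolute or relative, is to keep the first shape point
as an absolute value and each later point as a small integer offset
from its predecessor. The sketch below assumes a fixed resolution of
1e-5 degree; that resolution and the function names are illustrative
assumptions only.

```python
from typing import List, Tuple

SCALE = 100_000  # assumed resolution: one unit equals 1e-5 degree


def encode_relative(points: List[Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Encode (lat, lon) points as integer offsets from the previous point.

    The first entry is the absolute position in scaled integer units;
    every later entry is the difference from the point before it.
    """
    encoded = []
    prev_lat = prev_lon = 0
    for lat, lon in points:
        ilat, ilon = round(lat * SCALE), round(lon * SCALE)
        encoded.append((ilat - prev_lat, ilon - prev_lon))
        prev_lat, prev_lon = ilat, ilon
    return encoded


def decode_relative(encoded: List[Tuple[int, int]]) -> List[Tuple[float, float]]:
    """Rebuild absolute (lat, lon) points from the relative encoding."""
    points = []
    ilat = ilon = 0
    for dlat, dlon in encoded:
        ilat += dlat
        ilon += dlon
        points.append((ilat / SCALE, ilon / SCALE))
    return points
```

Because neighboring shape points are close together, the offsets are
small numbers that need fewer digits or bits than full coordinates.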
[0037] The data used in the geographic database
100 to represent the positions of roads and other
geographic features are the result of several processes.
According to an embodiment shown in Figure 7, these
processes include collection processes 300 and shape
point formation processes 301. These processes, 300
and 301, may be performed using equipment and
programs, as described in more detail below. The
equipment and programs may be used by the researchers in
assignment areas and/or the field offices, or
alternatively, the equipment and programs may be used by
technicians located elsewhere.
[0038] A first step 302 in the data collection
processes 300 includes driving a vehicle equipped with data
acquisition equipment along the roads for which road
position and geometry data are to be obtained. Figure 8
shows the components of an embodiment of the data
acquisition equipment 303 installed in the vehicle 304.
As shown in Figure 8, the equipment 303 installed in the
vehicle 304 includes a positioning system 306. The
positioning system 306 is used to obtain the geographic
coordinates of the vehicle 304 as the vehicle 304 is
being driven along the roads. As shown in Figure 8, the
positioning system 306 includes both a GPS system
component 307 and an inertial sensor component 308.
The GPS system component 307 acquires geographic
coordinates of the vehicle 304 using GPS satellite
signals. The inertial sensor component 308 acquires data
indicative of relative movement of the vehicle 304 in
three dimensions, including data indicative of vehicle
acceleration, velocity, and distance traveled. From these
data the relative geographic coordinates can be
obtained.
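
How relative coordinates might be derived from inertial data can be
sketched with a basic dead-reckoning loop. This is an illustrative
simplification, not the method of the disclosure: it works in two
dimensions only, assumes each sample supplies a forward acceleration
and a yaw rate, and uses plain Euler integration. The function name
and sample layout are assumptions.

```python
import math
from typing import Iterable, List, Tuple


def dead_reckon(samples: Iterable[Tuple[float, float, float]],
                speed0: float = 0.0,
                heading0: float = 0.0) -> List[Tuple[float, float]]:
    """Integrate inertial samples into positions relative to the start.

    Each sample is (dt_seconds, forward_accel_m_s2, yaw_rate_rad_s).
    Heading is measured clockwise from north, in radians.
    Returns (east_m, north_m) offsets, beginning with (0.0, 0.0).
    """
    x = y = 0.0
    speed, heading = speed0, heading0
    track = [(x, y)]
    for dt, accel, yaw_rate in samples:
        speed += accel * dt        # acceleration -> velocity
        heading += yaw_rate * dt   # yaw rate -> heading
        x += speed * dt * math.sin(heading)  # eastward displacement
        y += speed * dt * math.cos(heading)  # northward displacement
        track.append((x, y))
    return track
```

Errors in such an integration grow with time, which is one reason an
absolute source such as GPS is useful alongside the inertial data.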

[0039] Also included in the vehicle 304 is a portable
computer 309. Installed on the portable computer 309 is
a data acquisition program 310. In one embodiment, the
GPS system component 307, the inertial sensor
component 308, and the portable computer 309 are
connected together so that the data acquired by the GPS
system component 307 and the inertial sensor
component 308 can be stored on the portable computer hard
drive. (In one alternative embodiment, a secondary
GPS system may be used. The secondary GPS system
acquires GPS satellite time stamp data so that a
correlation can be made between the data collected by the
primary GPS system component 307 and the data
collected by the inertial sensor component 308.)
[0040] In one embodiment, the GPS system com- 
ponent 307 includes a DGPS unit manufactured by Ash- 
tech. Other suitable systems are commercially available 
from Garmin, Trimble, and Satloc. The inertial sensor 
component includes a gyroscope unit manufactured by 
KVH Industries (of New Jersey). Alternatively, a unit that 
combines a gyroscope and an accelerometer may be 
used. The portable computer may be a Pentium II-compatible
notebook computer. Suitable units from other
manufacturers may be used. One process for collecting 
DGPS data is described in the copending application 
Ser. No. 08/834,652, filed April 11, 1997, the entire
disclosure of which is incorporated by reference herein.
[0041] Referring to Figure 9, the vehicle 304 is 
shown being driven along roads 305 for which road 
position data are to be acquired. In a preferred embodiment,
the vehicle 304 is driven in a consistent, known
position relative to the centerline of the road. For exam- 
ple, the vehicle 304 is driven in the center of the right- 
most lane of the road whenever possible. (In countries 
in which vehicles are driven on the left hand side of the 
road, the vehicle would be driven in the center of the 
leftmost lane.) 
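
Driving at a consistent, known position relative to the centerline
means that the recorded path differs from the centerline by a known
lateral distance. The sketch below shows one way such an offset could
be applied. It is a hypothetical illustration, not the centerlining
process of the disclosure: it assumes the track has already been
projected to planar coordinates in meters, and it estimates the
direction of travel from neighboring points.

```python
import math
from typing import List, Tuple


def shift_to_centerline(track: List[Tuple[float, float]],
                        offset_m: float) -> List[Tuple[float, float]]:
    """Shift each (east_m, north_m) point sideways by `offset_m`.

    A positive offset moves points to the left of the direction of
    travel, as when the vehicle was driven in the rightmost lane.
    """
    if len(track) < 2:
        raise ValueError("at least two points are needed to find a direction")
    shifted = []
    last = len(track) - 1
    for i, (x, y) in enumerate(track):
        # Direction of travel estimated from the neighboring points.
        x0, y0 = track[max(i - 1, 0)]
        x1, y1 = track[min(i + 1, last)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            raise ValueError("neighboring points coincide; direction undefined")
        # The left-hand unit normal of (dx, dy) is (-dy, dx) / length.
        shifted.append((x - dy / length * offset_m,
                        y + dx / length * offset_m))
    return shifted
```

As an example with assumed values, on a road having two 3.6 meter
lanes in each direction, the center of the rightmost lane lies 5.4
meters from the centerline, so `offset_m` would be 5.4.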

[0042] As the vehicle 304 is being driven along the 
roads 305, the data acquisition program 310 (in Figure
8) in the vehicle 304 acquires the data output by the 
sensors (Step 312 in Figure 7). These data are referred 
to as "raw sensor data." The raw sensor data include 
different kinds of data depending on the kind of sensor 
from which the data is output. The GPS component 307 
provides data indicating the geographic coordinates at a 
particular instant of time. The inertial sensor component
308 provides data indicating the acceleration of the
vehicle at a particular instant of time. 
[0043] The sensor components may output data at 
regular or irregular intervals. Also, the different sensor 
components may output data at different rates. For 
example, the inertial system component 308 may output 
data every 0.1 second, whereas the GPS system
component 307 may output data every 1 second. The 
data acquisition program 310 acquires the raw sensor 
data from the different sensor components and stores 
the raw sensor data. Each item of data that is stored by 
the data acquisition program is associated with a time 
stamp, or other means of chronological identification, 
that indicates when the data had been acquired. 
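
The time-stamped storage described above can be illustrated with a minimal sketch. The record layout, the sensor labels "gps" and "inertial", and the function name `merge_by_time` are hypothetical and are not taken from the data acquisition program 310; the sketch only shows how readings arriving at different rates might be kept in one chronological log.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class RawReading:
    timestamp: float  # seconds since the start of the drive (assumed unit)
    sensor: str       # hypothetical label: "gps" or "inertial"
    values: tuple     # sensor-specific payload


def merge_by_time(*streams):
    """Merge per-sensor streams (each already in time order) into one log."""
    return list(heapq.merge(*streams, key=lambda r: r.timestamp))


# GPS once per second, inertial sensor every 0.1 second (rates from the text).
gps = [RawReading(t, "gps", (41.88, -87.63)) for t in (0.0, 1.0)]
inertial = [RawReading(round(0.1 * i, 1), "inertial", (0.2,)) for i in range(11)]
log = merge_by_time(gps, inertial)
```

Because every record carries its own time stamp, the later fusing step can pair each inertial reading with the GPS readings acquired before and after it.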
[0044] Figure 9 shows a plurality of positions, 
labeled with X's, extending along the road 305 upon 
which the vehicle 304 is being driven. Each of the 
labeled positions corresponds to one acquisition of raw 
sensor data by the data acquisition program 310 from 
the positioning system 306 indicating the position of the 
vehicle 304 as the vehicle 304 is driven along the road 
305. For various reasons, the data acquired by the posi-
tioning system 306 may not represent the true position
of the vehicle at the instant when the data was acquired. 
Some of these reasons include GPS signal interfer- 
ence, sensor drift, calibration errors, etc. 
[0045] In addition to acquiring data indicating the
position of the vehicle, additional data are collected as 
the vehicle is being driven along the roads 305. For 
example, as the vehicle is being driven, the number of 
lanes of the road is recorded. Also, the lane widths are
recorded. Road sign information may also be recorded
along with the position along the road at which a road 
sign is located. The locations of points of interest along 
the road may be noted. Additional types of information 
that may be recorded include the speed limit, the 
address ranges, the street name, the type of road (e.g.,
expressway, alley, etc.), the road surface, and so on. 
Some of this information may be recorded automatically 
and some of this information may be recorded using 
input from the researcher. The data acquisition program 
310 may include routines that allow some or all of these
types of information to be saved using voice commands 
or using keyboard and/or pointing device input. If the 
data acquisition program 310 supports entry of data 
using voice commands, the data acquisition equipment 
303 includes the appropriate hardware and software,
such as a microphone, speaker, and voice recognition 
software. The voice command features of the data 
acquisition program 310 may be similar or identical to 
those described in the copending patent application 
entitled "Method and System Using Voice Commands
for Collecting Data for a Geographic Database," Ser. 
No. 09/335,122, filed June 17, 1999, the entire disclo- 
sure of which is incorporated by reference herein. Alter- 
natively, some or all of these types of information may 
be recorded using maps or written ledgers.

[0046] In addition to acquiring data about the posi- 
tion of the vehicle as it is being driven, additional data 
may be acquired by other sensors in the vehicle. These 
other sensors may acquire data about the vehicle's 
heading and speed. These types of information may be
associated with the vehicle position data and stored as 
data using the data acquisition program 310. In addi- 
tion, the vehicle may be equipped with a camera. The 
camera may be mounted to take pictures in front of, to 
the sides of, and/or behind the vehicle as it is being
driven. The camera may take pictures on a regular basis 
(such as every 50 meters, or more frequently). The pic- 
tures may be stored as data and the positions of the pic- 
tures associated with the vehicle position data using a 
routine in the data acquisition program.

[0047] Figure 10 shows an enlargement of a portion
of one of the roads 305 along which raw sensor data 
indicating the vehicle position have been acquired. As 
shown in Figure 10, the raw sensor data include at least 
two types of data. One type of raw sensor data is GPS
raw sensor data. The GPS raw sensor data have been 
acquired by the GPS system component 307 of the 
positioning system 306. A second type of raw sensor 
data is inertial system raw sensor data. The inertial sys-
tem raw sensor data are acquired by the inertial system
component 308 of the positioning system 306. Note that 
the GPS data and the inertial system data may be 
acquired at different rates. Accordingly, there may be a 
greater number of one of these types of data than the 
other. For example, the GPS data may be acquired 
once per second whereas the inertial system data may 
be acquired once every 0.1 second. 
[0048] (Although Figure 10 shows two different 
kinds of raw sensor data, there may be more than two 
different kinds. For example, there may be other kinds of 
sensor data acquired by other types of sensor compo- 
nents, such as compass readings, odometer readings, 
speed pulse readings, etc. Each of these various sensor 
components may acquire data at different rates.) 
[0049] Referring again to Figure 7, after raw sensor 
data has been acquired for a portion of the road network 
and saved on a data storage device, such as the hard 
drive of the portable computer 308 located in the vehi- 
cle, post-processing steps are performed on the raw 
sensor data. These post-processing steps may be per- 
formed at the field office using programs installed on a 
computer (such as the programs 122 installed on the 
computer 121 in Figure 1). For example, one of these 
steps may include post-processing of the GPS data 
acquired while the vehicle was being driven using 
DGPS correction, if necessary (Step 319). With respect 
to the inertial system data, one of these post-processing 
steps may include deriving geographic coordinates from 
the acceleration data (Step 317). 
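
As a rough illustration of deriving positions from acceleration data (Step 317), the sketch below integrates planar acceleration samples twice. It is a simplification under stated assumptions: the samples are already expressed in a fixed planar frame in meters per second squared, the starting position and velocity are known, and a plain Euler step is used. The function name `integrate_acceleration` is hypothetical, and converting the planar result to latitude and longitude is not shown.

```python
def integrate_acceleration(accels, dt, pos0=(0.0, 0.0), vel0=(0.0, 0.0)):
    """Dead-reckon planar positions from (ax, ay) samples taken every dt seconds."""
    x, y = pos0
    vx, vy = vel0
    positions = [(x, y)]
    for ax, ay in accels:
        # First integration: acceleration -> velocity
        vx += ax * dt
        vy += ay * dt
        # Second integration: velocity -> position
        x += vx * dt
        y += vy * dt
        positions.append((x, y))
    return positions


track = integrate_acceleration([(1.0, 0.0), (1.0, 0.0)], dt=1.0)
# track == [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
```

Errors in the samples accumulate through both integrations, which is one reason the inertial track is later corrected against the GPS readings in the fusing step.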
[0050] The next step is to fuse the post-processed 
raw sensor data (Step 320). The post-processed raw 
sensor data are fused using a program installed on a 
computer (such as the computer 121 in Figure 1) which 
may be located at the field office. The program that 
fuses the raw sensor data may be included among the 
programs 122 installed on one of the workstation com- 
puters 121 at the field office. Alternatively, the fusing 
step may be performed using a program installed on a 
computer located at another location. 
[0051] The fusing step 320 is described in connec- 
tion with Figure 11. Figure 11 shows the raw sensor 
data from Figure 10, including the raw GPS data and 
the raw inertial sensor data. In the fusing process, each 
of these data entries may be modified by taking into 
account another type of data. For example, each raw 
inertial sensor reading acquired between two raw GPS 
sensor data readings may be adjusted (i.e., latitude, lon- 
gitude, and optionally altitude modified) by the two raw 
GPS sensor data readings obtained before and after the 
raw inertial sensor reading. Further, each raw iner-
tial sensor reading may be adjusted by the curvature
data obtained before, during, and after the raw inertial 
sensor reading. Likewise, each raw GPS sensor read- 
ing may be modified taking into account the raw inertial 
sensor data readings before and after the GPS data 
acquisition. Also, each raw GPS sensor reading may be
adjusted by the curvature data obtained before, during,
and after the raw GPS sensor reading. As a result of this 
fusing step, each raw inertial sensor data reading and 
each raw GPS sensor reading is fused, forming a fused
sensor reading. Each fused sensor reading includes the
same components, e.g., geographic coordinates 
(including altitude), curvature, and grade. 
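
One simple way to adjust inertial readings by the GPS readings obtained before and after them is sketched below. This is an illustrative assumption, not the fusing program itself: the disagreement between the GPS fix and the inertial track is measured at each end of the interval and blended linearly over time. The function name `fuse_between_fixes` and the planar (t, x, y) tuples are hypothetical, and curvature, altitude, and grade are omitted.

```python
def fuse_between_fixes(inertial, gps_before, gps_after):
    """Correct inertial samples using the two GPS fixes that bracket them.

    inertial: list of (t, x, y); the first and last samples fall at the fix times.
    gps_before, gps_after: (t, x, y) GPS fixes at the ends of the interval.
    """
    t0, gx0, gy0 = gps_before
    t1, gx1, gy1 = gps_after
    _, ix0, iy0 = inertial[0]
    _, ix1, iy1 = inertial[-1]
    # Offset between GPS and inertial track at the start and end of the interval
    dx0, dy0 = gx0 - ix0, gy0 - iy0
    dx1, dy1 = gx1 - ix1, gy1 - iy1
    fused = []
    for t, x, y in inertial:
        w = (t - t0) / (t1 - t0)  # 0 at the first fix, 1 at the second
        fused.append((t, x + (1 - w) * dx0 + w * dx1, y + (1 - w) * dy0 + w * dy1))
    return fused


inertial = [(0.0, 0.0, 0.0), (0.5, 5.0, 0.0), (1.0, 10.0, 0.0)]
fused = fuse_between_fixes(inertial, (0.0, 0.0, 0.0), (1.0, 10.0, 2.0))
# fused[1] == (0.5, 5.0, 1.0): half of the 2-meter end offset is applied mid-interval
```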
[0052] Referring to Figure 7, according to one 
embodiment, after the fusing step, the fused data are 
smoothed (Step 330). The smoothing step 330 can be
performed by a program on the same computer (i.e.,
computer 121 in Figure 1) that performed the fusing
step or, alternatively, the smoothing step may be per-
formed on a different computer. The program that per-
forms the smoothing step may be included among the
programs 122 installed on one of the workstation com- 
puters 121 at the field office. The smoothing step 330 is 
described in connection with Figure 12. Programs, tech- 
niques, and algorithms for smoothing data points are 
known. One way to implement the smoothing is to use a
least-squares fit to a cubic equation. Another way to
implement smoothing is to use a Kalman filter. The 
Kalman filter technique weighs each individual sensor 
error tolerance to determine how to smooth the points. 
Figure 12 shows the locations represented by the fused
data readings. Using the smoothing algorithm, the fused 
data points are smoothed. The smoothing process 
results in a plurality (i.e., more than one) of smoothed 
data points. In one embodiment, each of the fused data 
points results in one smoothed data point.
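
The least-squares cubic fit mentioned above can be sketched in plain Python as follows. The choices here are assumptions made for illustration: x and y are each fitted against the sample index, the normal equations are solved by Gaussian elimination, and at least four points are supplied. The names `fit_cubic` and `smooth` are hypothetical. One smoothed point is produced per fused point, as in the embodiment described.

```python
def fit_cubic(ts, vs):
    """Coefficients c0..c3 minimising sum((c0 + c1*t + c2*t^2 + c3*t^3 - v)^2)."""
    n = 4
    # Normal equations A c = b with A[i][j] = sum(t^(i+j)) and b[i] = sum(v * t^i)
    A = [[float(sum(t ** (i + j) for t in ts)) for j in range(n)] for i in range(n)]
    b = [float(sum(v * t ** i for t, v in zip(ts, vs))) for i in range(n)]
    # Forward elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


def smooth(points):
    """Return one smoothed (x, y) point for each fused (x, y) point."""
    ts = list(range(len(points)))
    cx = fit_cubic(ts, [p[0] for p in points])
    cy = fit_cubic(ts, [p[1] for p in points])

    def evaluate(c, t):
        return c[0] + c[1] * t + c[2] * t ** 2 + c[3] * t ** 3

    return [(evaluate(cx, t), evaluate(cy, t)) for t in ts]
```

A single cubic is only reasonable over a short stretch of road; a longer track would presumably be fitted piecewise or handled with the Kalman filter alternative.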

[0053] In an alternative embodiment, the fusing 
step 320 and the smoothing step 330 may be combined 
into a single fusing-smoothing step which is performed 
on the data at the same time. 
[0054] After the fused data are smoothed, the next
step is to remove outliers. Removal of outliers is an 
optional step that may be omitted in some alternative 
embodiments of the data collection processes (300 in 
Figure 7). Referring to Figure 7, removal of outliers 
includes the steps of identifying the outliers (Step 340)
and removing the outliers (Step 350). The outlier identi-
fication and removal steps, 340 and 350, can be per-
formed on the same computer that performed the fusing
and smoothing steps 320 and 330, or alternatively, the
outlier identification and removal steps may be per-
formed on a different computer. The program that per-
forms the steps of identifying outliers and then removing
the outliers may be included among the programs 122 
installed on one of the workstation computers 121 at the
field office.

[0055] The outlier identification and removal steps 
340 and 350 are described in connection with Figures 
13 and 14. Figure 13 shows the smoothed data points 
from Figure 12 as well as the fused sensor data points 
from which the smoothed data points were derived. In
the outlier identification process 340, each fused data 
point is evaluated relative to the smoothed data point 
derived therefrom. Various kinds of evaluation may be 
used. One evaluation that may be used is to determine
a distance between each fused data point and the 
smoothed data point derived therefrom. For each 
smoothed data point, this distance is compared to a 
configurable threshold distance. If the distance between 
the fused data point and the smoothed data point 
derived therefrom exceeds the threshold distance, the 
fused data point is identified as an outlier. Figure 13 
shows a fused data point that has been identified as an 
outlier because the distance from the fused data
point to the smoothed data point derived therefrom
exceeds a threshold distance. Using the outlier identifi- 
cation process 340, all the outliers included in the fused 
data points may be identified. 
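
The distance evaluation described above amounts to a per-point comparison against the configurable threshold. A minimal sketch, assuming planar (x, y) coordinates in meters and a hypothetical function name `find_outliers`:

```python
import math


def find_outliers(fused, smoothed, threshold):
    """Indices of fused points farther than threshold from their smoothed points."""
    return [
        i
        for i, (f, s) in enumerate(zip(fused, smoothed))
        if math.dist(f, s) > threshold
    ]


fused = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0), (3.0, 0.0)]
smoothed = [(0.0, 0.1), (1.0, 0.2), (2.0, 0.5), (3.0, 0.1)]
outliers = find_outliers(fused, smoothed, threshold=1.0)  # -> [2]
```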
[0056] Referring again to Figure 7, after the outliers 
in the original set of fused data have been identified, the 
outliers are removed, thereby forming a new set of 
fused data that excludes the outliers (Step 350). This 
new set of fused data is smoothed again. In one embod- 
iment, the same smoothing algorithm used to smooth 
the fused data the first time may be used again (Step 
330). Alternatively, the new set of fused data with the 
outliers removed may be smoothed using a different 
smoothing algorithm. Figure 14 illustrates application of 
the smoothing algorithm to the new set of fused data. As 
shown in Figure 14, the outlier identified in Figure 13 
has been removed. The smoothing algorithm is applied 
to the remaining fused data points. Because the outliers 
have been removed, the new smoothed curve resulting 
from the application of the smoothing algorithm to the 
new set of fused data points may be displaced from the 
previous smoothed curve. Likewise, a new set of 
smoothed data points, which lie along the new 
smoothed curve, may be displaced from the corre- 
sponding original smoothed data points. (Note that the 
new smoothed curve does not include a smoothed data 
point corresponding to the identified outlier.) 
[0057] The steps of identifying and removing out- 
liers (Steps 340 and 350) may be performed more than 
once. For example, after a new set of smoothed data 
has been prepared, outliers may be identified again 
using an evaluation of the displacement of each of the 
remaining fused data points from its corresponding new 
smoothed data point. When performing this evaluation, 
a threshold distance may be used that is the same as 
the threshold distance that was used the previous time, 
or alternatively, a different threshold distance may be 
used. 

[0058] The number of times that the outlier removal 
steps (Steps 340 and 350) are performed may be con- 
figurable. Alternatively, the outlier steps may be per- 
formed until no outliers are identified with a given 
distance threshold. 
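
The repeated smooth, identify, and remove cycle can be sketched as a loop that stops either after a configurable number of passes or when no outliers remain. The three-point moving average below is only a stand-in smoother chosen to keep the example short; it is not the cubic fit or Kalman filter named earlier. The names `moving_average`, `smooth_without_outliers`, and `max_passes` are hypothetical.

```python
import math


def moving_average(points):
    """Stand-in smoother: 3-point moving average (shorter window at the ends)."""
    out = []
    for i in range(len(points)):
        window = points[max(0, i - 1): i + 2]
        out.append((sum(p[0] for p in window) / len(window),
                    sum(p[1] for p in window) / len(window)))
    return out


def smooth_without_outliers(fused, threshold, smoother=moving_average, max_passes=None):
    """Repeat smooth -> identify -> remove until no outliers remain or max_passes is hit."""
    kept = list(fused)
    passes = 0
    while True:
        smoothed = smoother(kept)
        survivors = [f for f, s in zip(kept, smoothed) if math.dist(f, s) <= threshold]
        no_outliers = len(survivors) == len(kept)
        out_of_passes = max_passes is not None and passes >= max_passes
        if no_outliers or out_of_passes:
            return kept, smoothed
        kept = survivors  # new set of fused data that excludes the outliers
        passes += 1


fused = [(0.0, 0.0), (1.0, 0.0), (2.0, 9.0), (3.0, 0.0), (4.0, 0.0)]
kept, smoothed = smooth_without_outliers(fused, threshold=4.0)
# kept == [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
```

Each pass either returns or removes at least one point, so the loop always terminates.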
[0059] Referring to Figure 7, after the fused data
are smoothed one or more times, the resulting set of 
smoothed fused data is used by the shape point forma- 
tion processes 301 to form shape point data for the pri-
mary copy 100 of the geographic database (in Figure 1).
One of these processes 301 is to automatically select 
which of the fused smoothed data to use to form shape 
points (Step 398). The step 398 of automatically select- 
ing shape points is performed by an automatic shape 
point generation program 400, some of the components
of which are shown in Figure 15. The shape point gen- 
eration program 400 may be included among the pro- 
grams 122 installed on the one or more workstation 
computers 121 located at the field office. Alternatively, 
the shape point generation program 400 may be
installed on another computer, such as the computer 
308 (in Figure 8) used to collect the road position data. 
[0060] Figure 15 shows the component steps per- 
formed by an embodiment of an automatic shape point 
generation program 400. The steps performed by the
automatic shape point generation program 400 are part 
of the processes (301 in Figure 7) used to form shape 
point data from the smoothed fused data for the master 
copy 100 of the geographic database (in Figure 1). 
[0061] A first step (Step 410) performed by the
shape point generation program 400 is to receive the 
smoothed fused data from the data collection processes 
300. If the shape point generation program 400 is 
installed on the same computer used to perform the 
steps of fusing and smoothing the raw data, this step
may involve reading a file which is already stored on the 
computer. 

[0062] Another step (Step 420) performed by the 
shape point generation program 400 is to accept input 
parameters 426. These input parameters 426 may be
provided to the shape point generation program 400 in 
two ways. One way is to specify an accuracy level. The 
accuracy level may be specified as a distance. For 
example, the accuracy level may be specified as 1 
meter, 5 meters, 0.5 meters, etc.

[0063] There are various ways to determine the 
accuracy level to specify. According to one embodiment, 
the accuracy level is determined based upon the appli- 
cations that are expected to use the database products 
derived from the primary copy of the geographic data-
base. The application that requires the greatest accu-
racy is identified. Then, an accuracy level is specified 
which is consistent with the accuracy needed for this 
application. For example, if automatic vehicle control 
applications (such as obstacle warning and avoidance,
curve warning, advanced cruise control, headlight aim-
ing, and so on) require the greatest accuracy, then the
level of accuracy for the master copy is specified to be 
at least as accurate as the accuracy level needed for
these automatic vehicle control applications. 
[0064] In one embodiment, for applications that 
require a high level of accuracy, a value between 
approximately 3 and 5 meters may be specified. For
applications that require a higher level of accuracy, a 
value between approximately 1 and 3 meters may be 
specified. For applications that require the highest level 
of accuracy, a sub-meter accuracy level (e.g., 0.5) is
specified. For applications that require lower accuracy,
an accuracy level above 5 meters may be specified. 
[0065] According to a present embodiment, the
accuracy level specified to the automatic shape point
generation program may include two components: a
planar tolerance component and a vertical tolerance
component. The planar tolerance component is used to 
define an accuracy level for the geographic database 
horizontally (i.e., in a plane with respect to the surface of
the earth, such as latitude and longitude). The vertical 
tolerance component is used to define a level of accu-
racy for the geographic database vertically (i.e., alti-
tude). The planar tolerance component and the vertical 
tolerance component may be set to the same value 
(e.g., "1 meter") or may be set to different values (e.g.,
"1 meter" for the planar component and "5 meters" for
the vertical component). (Alternatively, the planar
and/or vertical tolerances may be specified as relative
values instead of absolute values. For example, the pla-
nar and/or vertical tolerances may be specified as 10%
and 15% respectively.)
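
A two-component accuracy level can be represented as a small value object. The sketch below is an assumption about how such a tolerance might be checked, using absolute values in meters only; the class name `Tolerance` and the method `exceeded_by` are hypothetical, and relative (percent) tolerances are not covered.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Tolerance:
    planar: float    # meters, horizontal (latitude/longitude plane)
    vertical: float  # meters, altitude

    def exceeded_by(self, dx, dy, dz):
        """True if a deviation (dx, dy, dz) in meters violates either component."""
        return math.hypot(dx, dy) > self.planar or abs(dz) > self.vertical


# "1 meter" planar and "5 meters" vertical, as in the example above.
tol = Tolerance(planar=1.0, vertical=5.0)
```

Keeping the two components separate lets a deviation pass horizontally while still failing vertically, or the reverse.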
[0066] An alternative way to provide input parame- 
ters is to specify types of predetermined inputs. Some 
of these types of predetermined inputs are designed to 
facilitate the consistent selection of appropriate param-
eters. Examples of these types of predetermined inputs
may include the following:

(1). A database type. This type of input relates an
accuracy level for the shape point data to the type
of application that is expected to use the geo-
graphic database for which the shape data are
being provided. To use this type of input parameter,
a researcher specifies a type of database applica-
tion, such as "drivers' assistance" or "navigation-
related." As described further below, when a type of
database application is specified, a look up table
(e.g., 434, described below) is used that relates the
type of database application to an accuracy level.
Thus, if the researcher inputs "navigation", an accu-
racy level of "5 meters" is specified. If the
researcher inputs "drivers' assistance", an accu-
racy level of "1 meter" is specified.

(2). Directional changes in any one of three dimen-
sions. This input parameter provides that a shape
point be generated any time there is a change of a
specified distance in any one of three dimensions.
(Alternatively, this input parameter may specify that
a shape point be generated any time there is a rel-
ative change in any one of three dimensions, such
as a change of a specified percent.)

(3). Directional changes in any combination of three
dimensions. This input parameter provides that a
shape point be generated any time there is a
change of a specified distance in any combination
of three dimensions. (Alternatively, this input
parameter may specify that a shape point be gener-
ated any time there is a relative change in any com-
bination of three dimensions, such as a change of a
specified percent.)

(4). Road characteristics. This type of input relates
an accuracy level for the shape point data to a char-
acteristic of the road. To use this type of input
parameter, a look up table (e.g., 434) is used that
relates road characteristics to accuracy levels.
Thus, if the road is characterized as an "express-
way" a certain level of accuracy is used (e.g., 1
meter). If the road is characterized as an "alley", a
different accuracy level (e.g., 5 meters) may be used.
Other road characteristics that may be used include
number of lanes, speed limit, surface (e.g., paved,
gravel), and so on.

(5). Geographic area. This type of input relates an
accuracy level for the shape point data to the city,
state, country, etc., the road is located in. To use
this type of input parameter, a look up table (e.g.,
434) is used that relates locations to accuracy lev-
els. Thus, if the road is located in an unincorporated
area, a different accuracy level may be used than if
the road is located in a municipality.

[0067] One or more of these parameters 426 may
be specified to the shape point generation program
400. If one of these parameters is not specified, the
shape point generation program 400 may use a default
value. After the input parameters 426 are received,
some of these parameters may be matched to numeric
values (Step 430). A look up table 434 may be used for
this purpose. The look up table 434 includes accuracy
values related to specified parameter entries. For exam-
ple, a speed limit entry value of "55 mph" may corre-
spond to a directional change value of "1 meter."
[0068] After the desired level of accuracy is speci-
fied for the resultant geographic database, the shape
point generation program 400 runs a shape point gener-
ation algorithm on the smoothed fused data (Step 440).
The shape point generation algorithm determines which
of the smoothed fused road position data to discard.
The smoothed fused road position data that are dis-
carded are unnecessary to provide the desired level of
accuracy for the geographic database. The smoothed
fused road position data that are not discarded are used
to form shape point data for the geographic database.
The smoothed fused road position data that are not dis-
carded are necessary to provide the desired level of
accuracy for the geographic database.
[0069] The shape point generation algorithm oper-
ates on a road segment by road segment basis. Thus,
the shape point generation algorithm determines which 
fused data points to discard with respect to one road 
segment before proceeding to determine which fused 
data points to discard with respect to the next road seg-
ment. Accordingly, as an initial step, the fused data
points corresponding to the nodes at the end points of a 
road segment are determined. As mentioned above in 
connection with Figure 5, in a database representation 
of a road segment, data attributes are stored to indicate
the locations of the end points (i.e., nodes) at each end 
of a road segment. Therefore, as part of the process of 
the shape point generation algorithm, the two fused 
data points located closest to the end points of the road 
segment being represented are identified and these two 
fused data points are indicated as being used to repre-
sent the nodes at the end points of the road segment.
(These two fused data points are not marked for dis- 
carding.) 
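
Identifying the fused data points closest to the two nodes can be sketched as a nearest-point search. The sketch assumes planar (x, y) coordinates and uses a hypothetical function name `nearest_index`; it illustrates the idea only.

```python
import math


def nearest_index(points, node):
    """Index of the fused data point closest to a node position."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], node))


fused = [(0.2, 0.1), (5.0, 0.4), (9.8, 0.3), (15.1, 0.2)]
start_idx = nearest_index(fused, (0.0, 0.0))   # -> 0
end_idx = nearest_index(fused, (10.0, 0.0))    # -> 2
```

The points at `start_idx` and `end_idx` would be kept unconditionally, and only the points between them would be candidates for discarding.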

[0070] After the fused data points corresponding to 
the end points of the road segment are identified, the 
shape point generation algorithm determines which 
fused data points located between these end points can 
be discarded. The shape point generation algorithm 
provides for a series of evaluations. In general, the 
shape point generation algorithm evaluates whether 
each smoothed fused data point deviates enough from 
a straight line generated from a previous data point so 
that a shape point corresponding to the data point being 
evaluated should be included in the database. The eval- 
uation process continues until all the fused data points 
that are located along the road segment being repre- 
sented are evaluated. 

[0071] The process used by the shape point gener- 
ation algorithm in order to determine which smoothed 
fused data to discard is described in connection with 
Figures 16A-16E. Figures 16A-16E illustrate application 
of the shape point generation algorithm to a series of 
smoothed fused data points 455. The data points 455 in
Figure 16A are the data points provided by the output of
the data collection processes 300 (in Figure 7). These 
data points 455 represent the vehicle path (after 
smoothing, if performed) as the vehicle was being 
driven along the road, and hence the data points in Fig-
ure 16A outline the geometry of the road upon which the
vehicle was being driven. 

[0072] One of the points in Figure 16A is selected 
as a starting point of a straight line approximation of the 
represented road. In one embodiment, the point 
selected as the starting point is the point that coincides 
with the node at an endpoint of the road segment. In 
Figure 16A, the first point is selected as the starting 
point. From the starting point (i.e., the first point), a 
straight line is determined between the first point and 
the third point, skipping the intermediate (i.e., the sec- 
ond) point. The straight line connecting the first and 
third data points represents a proposed approximation 
of the road shape. This proposed approximation of the
road shape is evaluated by the shape point generation
algorithm to determine whether it satisfies the specified 
criterion (from Step 420) for the accuracy of the road. 
This evaluation includes determining the distance 
between the intermediate point and the straight line 
connecting the first and third points and then comparing 
this distance to a threshold distance. (This distance is 
calculated as the shortest distance and therefore is the 
distance along a line normal to the straight line connect- 
ing the first and third points.) The threshold distance is 
configurable (as described above) and specified by or 
derived from the input parameters 426 used to specify 
the level of accuracy of the geographic database. 
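
The shortest distance from the intermediate point to the straight line connecting the first and third points can be computed with the standard cross-product formula. The sketch assumes planar (x, y) coordinates in meters; the function name `distance_to_line` is hypothetical.

```python
import math


def distance_to_line(p, a, b):
    """Perpendicular distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        # a and b coincide, so fall back to the point-to-point distance
        return math.hypot(px - ax, py - ay)
    # |cross product| / base length gives the height of the triangle (a, b, p)
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length


d = distance_to_line((1.0, 1.0), (0.0, 0.0), (2.0, 0.0))  # -> 1.0
```

The intermediate point can be marked for discarding when `d` does not exceed the threshold distance derived from the input parameters 426.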
[0073] In Figure 16A, the distance between the 
intermediate point and the line connecting the first and
third points is less than the threshold distance. If the dis-
tance between the intermediate point and the line con-
necting the first and third points is less than the
threshold distance, the intermediate point can be 
marked for discarding. Then, the shape point genera- 
tion algorithm proceeds to examine the next data point 
in the series 455. 

[0074] Referring to Figure 16B, the shape point 
generation algorithm calculates a straight line between 
the first data point and the fourth data point, skipping 
the intermediate data points (i.e., the second and third 
data points). The distance between the second data 
point and the straight line connecting the first and fourth 
data points is determined and compared to the thresh- 
old distance. Also, the distance between the third data 
point and the straight line connecting the first and fourth 
data points is determined and compared to the thresh- 
old distance. In Figure 16B, neither of these distances is 
greater than the threshold distance. Therefore, the third 
data point can be marked for discarding. Then, the 
shape point generation algorithm proceeds to examine
the next data point in the series. 
[0075] Referring to Figure 16C, the shape point 
generation algorithm calculates a straight line between 
the first data point and the fifth data point. As before, the 
distances between each of the intermediate data points 
and the straight line are determined. In this case, the 
distances of the second, third, and fourth data points to 
the straight line are determined. Each of these dis- 
tances is compared to the threshold distance. As 
before, if none of these distances exceeds the threshold 
distance, the fourth data point can be marked for dis- 
carding and the next data point would be evaluated, and 
so on. However, in Figure 16C, the straight line distance 
between the third data point and the straight line con- 
necting the first and fifth data points exceeds the dis- 
tance threshold. When any one of the intermediate data 
points (i.e., the second, third, or fourth) is more distant 
from a straight line connecting the first and fifth points 
than the threshold distance, a determination is made 
that the path of the road is curved enough that the 
straight line representing the approximation of the road 
does not sufficiently describe the actual road shape. 
Therefore, the immediately previous data point in the
series (i.e., in this case, the fourth data point) is deter- 
mined as being necessary so that a straight line con- 
necting the first and fourth data points sufficiently 
approximates the road shape, as shown in Figure 16D. 
(The fourth data point is necessary so that none of the 
intermediate data points, i.e., the second and third, is 
more distant from the straight line approximation than 
the distance threshold.) Thus, since the fourth data 
point is determined as necessary, the fourth data point 
is selected as a proto-shape point. (The selected data 
point is referred to as a "proto-shape point" because the 
data point may be modified by the automatic centerlin- 
ing program 500, described below.) The fourth data 
point and the first data point are marked as proto-shape 
points and data indicating their selection as proto-shape 
points are stored. If the automatic shape point genera- 
tion program 400 is being run on the computer worksta- 
tion 121 at the field office, the data indicating the status 
of these points as proto-shape points may be stored on 
the hard drive of the computer. In an alternative embod- 
iment, the proto-shape points may be stored separately, 
in a separate file and/or on a separate data storage 
device 460. 

[0076] The second and third data points in Figure 
16D are marked as discarded. This means that the data
representing the second and third data points are not 
used in the formation of shape point data for the geo- 
graphic database 100. Data indicating the discarded 
status of these data points are stored. 
[0077] After the fourth data point has been selected 
as a proto-shape point, the fourth data point is used as 
the starting end of a new straight line approximation of 
the road shape. This is illustrated in Figure 16E. A
straight line is formed connecting the fourth data point 
and the sixth data point, skipping the intermediate data 
point, i.e., the fifth data point. As before, the distance 
between the intermediate data point (i.e., the fifth data 
point) and the straight line approximation between the 
fourth data point and the sixth data point is compared to 
the threshold distance. If the distance exceeds the 
threshold distance, the fifth data point is selected as a 
proto-shape point. On the other hand, if the distance 
between the fifth data point and the straight line con- 
necting the fourth data point and the sixth data point 
does not exceed the threshold, a straight line approxi- 
mation is calculated between the fourth data point and 
the seventh data point, and so on. 
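
The evaluation walked through in Figures 16A-16E can be sketched as a single pass over the points of one road segment. This is an illustrative reading of the description, assuming planar (x, y) coordinates and a single threshold distance; the names `select_proto_shape_points` and `_distance_to_line` are hypothetical. When any skipped point lies too far from the proposed line, the immediately previous point is kept and becomes the start of the next line.

```python
import math


def _distance_to_line(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])) / length


def select_proto_shape_points(points, threshold):
    """Indices of the points kept as proto-shape points; end points are always kept."""
    if len(points) < 3:
        return list(range(len(points)))
    kept = [0]
    start, end = 0, 2
    while end < len(points):
        too_far = any(
            _distance_to_line(points[i], points[start], points[end]) > threshold
            for i in range(start + 1, end)
        )
        if too_far:
            start = end - 1      # immediately previous point is necessary
            kept.append(start)
            end = start + 2      # new line skips one intermediate point
        else:
            end += 1             # extend the line to the next data point
    kept.append(len(points) - 1)
    return kept


track = [(0, 0), (1, 0.1), (2, 0.3), (3, 0.2), (4, 2.0), (5, 2.0), (6, 2.0)]
selected = select_proto_shape_points(track, threshold=0.5)  # -> [0, 3, 4, 6]
```

Indices not returned correspond to points marked for discarding; a larger threshold keeps fewer points, and a very large one keeps only the two end points.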
[0078] Using the above described process, all the 
fused data points along the road segment are evalu- 
ated. All the fused data points along a segment are eval- 
uated when the fused data point that coincides with the 
node at the far end of the road segment is encountered. 
As mentioned above, the data associated with the fused 
data point located at the far end of the road segment will 
be used to represent a node of the road segment in the 
resultant database. This fused data point will also be 
used to form the node at the starting end of the next
segment. Therefore, the shape point generation algo-
rithm uses this fused data point as the starting point for
evaluation of the fused data points along the next road 
segment. The fused data points along the next segment 
5 are evaluated in the same manner as the fused data 
points were evaluated in the prior segment. In this man- 
ner, all the fused data points along all the road seg- 
ments are evaluated 

[0079] As the fused data points for all the road seg-
ments are evaluated, those data points that are selected
as "proto-shape points", together with data indicating their
status as proto-shape points, are stored in the data
storage 460.

[0080] Referring again to Figure 16D, it was stated 
above that the second and third data points in Figure
16D are marked for discarding. Although the data repre-
senting the second and third data points are not used in
the formation of shape point data for the geographic
database 100, they may not actually be thrown away.
Instead, the second and third data points (along with the
rest of the smoothed fused data, including the discarded 
data points) may be stored in a data archive 466. The 
data in the data archive 466 may be used at a later time 
to form different databases having different levels of 
accuracy. For example, if a database with a greater
accuracy is desired at some later time, a different, lower 
threshold distance would be specified in the shape point 
generation program 400. Then, the shape point genera- 
tion program 400 would be run again using the 
smoothed fused data that had been stored in the data
archive 466. When run with the lower distance thresh- 
old, some of the smoothed fused data points that had 
been marked as discarded the first time would be 
selected as proto-shape points when the shape point 
generation program is run again. (Similarly, some of the
fused data points that had been selected the first time 
may not be selected the second time.) 
[0081] According to one embodiment of the shape 
point generation algorithm, an exception to the process 
described in Figures 16A-16E is made when the last
fused data point located at the far end of a road seg-
ment is encountered. As stated above, the last fused
data point located at the far end of the road segment is
used to form a node of the segment. Therefore, the data
associated with this fused data point (corresponding to
the far node of the segment) will be included in the geo-
graphic database regardless of how close it is located to
the immediately prior fused data point which had been
determined as a proto-shape point. It may occur that the
fused data point corresponding to the node at the far
end of the road segment will be relatively close to that
fused data point determined as a proto-shape point
immediately prior to it. If this occurs, a balancing proc-
ess is performed by the shape point generation algo-
rithm. According to this balancing process, a new fused
data point is selected as the proto-shape point imme-
diately prior to the end point. The new fused data point
is selected to balance the distances between the two
proto-shape points immediately prior to the end point.
To implement this balancing step, the second closest 
proto-shape point (determined by the evaluations per- 
formed by the shape point generation algorithm) prior to 
the far node is identified. Then, all the fused data points 
(including any fused data points that had already been 
marked for discarding) located between this point and 
the end point are evaluated. The data point located
approximately halfway between the second closest
proto-shape point and the end point is identified and
used as the new proto-shape point.
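
The balancing step can be sketched as follows. This is a hedged illustration: the function name `balance_last_proto_point` and the `min_gap` parameter are hypothetical, because the text does not state how "relatively close" is decided, and "approximately halfway" is assumed here to mean halfway along the path through the fused data points.

```python
import math


def balance_last_proto_point(points, kept, min_gap):
    """Rebalance the proto-shape point just before the far node.

    points  : planar (x, y) fused data points for one road segment
    kept    : sorted indices of proto-shape points, ending with the far node
    min_gap : hypothetical closeness limit that triggers the balancing
    """
    if len(kept) < 3:
        return list(kept)
    anchor, last, end = kept[-3], kept[-2], kept[-1]
    if math.dist(points[last], points[end]) >= min_gap:
        return list(kept)  # not too close, nothing to balance
    # Path length from the second closest proto-shape point to each
    # following point, including points previously marked as discarded.
    cumulative = [0.0]
    for i in range(anchor + 1, end + 1):
        cumulative.append(cumulative[-1] + math.dist(points[i - 1], points[i]))
    half = cumulative[-1] / 2.0
    best = min(range(anchor + 1, end),
               key=lambda i: abs(cumulative[i - anchor] - half))
    return list(kept[:-2]) + [best, end]
```

The function returns a new index list and leaves its inputs unchanged, so the original selection can still be archived.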
[0082] The process described in connection with 
Figures 16A-16E relates to the planar component of the 
level of accuracy. With respect to the vertical compo- 
nent, a separate test is performed as each fused data 
point is evaluated. As each fused data point is evalu- 
ated, a change of altitude is calculated relative to the 
altitude of the previous fused data point that had been 
selected as a proto-shape point. If the change of altitude 
is greater than the specified vertical component of the 
level of accuracy, the immediately previous fused data 
point is selected as a proto-shape point so that the 
change in altitude between two proto-shape points does 
not exceed the vertical component of the specified level 
of accuracy. 
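
The vertical test can be sketched on its own as follows. This is an assumption-laden illustration: the function name `select_by_altitude` is hypothetical, altitudes are assumed to be given in the same units as the vertical tolerance, and the planar test of Figures 16A-16E is left out so that only the altitude rule is shown.

```python
def select_by_altitude(altitudes, vertical_tolerance):
    """Return indices selected as proto-shape points by the vertical test."""
    kept = [0]
    i = 1
    while i < len(altitudes):
        last = kept[-1]
        change = abs(altitudes[i] - altitudes[last])
        if change > vertical_tolerance and i - 1 > last:
            # Select the immediately previous point, then re-test point i
            # against this new proto-shape point on the next pass.
            kept.append(i - 1)
        else:
            i += 1
    return kept
```

For a steady climb of one unit per point and a tolerance of 2.5 units, every second point is selected.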

Alternative process for selection of shape data to discard

[0083] An alternative process can be used when 
the fused data points represent a road along which the 
curvature direction reverses. (An S-shaped road is an 
example of a road along which the curvature direction 
reverses.) The process as described in connection with 
Figures 16A-16E can be used to select which fused 
data points to discard when the curvature of the road 
reverses. However, it may be preferable under some cir- 
cumstances to modify the process described in Figures 
16A-16E when the fused data points represent a road 
along which the curvature reverses direction. In a data 
representation of a road along which a reversal of cur-
vature direction occurs, it would be preferable to identify,
as closely as possible, that point at which the curvature 
reverses direction. Accordingly, it may be preferable to 
select as a proto-shape point that fused data point that 
is closest to the location at which the direction of curva-
ture reverses even if the fused data point would not oth-
erwise be selected as a proto-shape point.
[0084] An example of how this alternative process 
is applied is shown in Figures 17A-17E. Figure 17A
shows a series of fused data points 456. These fused 
data points follow an S-shaped path. As in the embodi- 
ment described in connection with Figures 16A-16E, 
one of the points in Figure 17A is selected as a starting 
point of a straight line approximation of the represented 
road. From the starting point (i.e., the first point), a 
straight line is determined between the first point and 
the third point, skipping the intermediate (i.e., the sec-
ond) point. The straight line connecting the first and
third data points represents an approximation of the
road shape which is evaluated to determine whether it 
satisfies the specified criterion for the accuracy of the 
road. As in the previously described process, this evalu-
ation includes determining the distance between the
intermediate point and the straight line connecting the
first and third points and then comparing this distance to
a threshold distance. If the distance between the inter-
mediate point and the line connecting the first and third
points is less than the threshold distance, the next point
in the series is evaluated.

[0085] Figure 17B shows a straight line connecting 
the first and fourth points. This embodiment of the 
shape point generation algorithm performs an evalua-
tion of the distances between both the intermediate
points (i.e., the second and third points) and the straight
line connecting the first and fourth points. In Figure 17B,
neither of these distances is greater than the threshold
distance. The shape point generation algorithm pro-
ceeds to examine the next data point in the series.
[0086] In Figure 17C, the shape point generation 
algorithm calculates a straight line between the first 
data point and the fifth data point. Note that in Figure 
17C, the third and fourth data points are on the opposite
side of the straight line connecting the first and fifth data
points. As before, the distances between each of the
intermediate data points (i.e., the second, third, and
fourth points) and the straight line are determined. Each
of these distances is compared to the threshold dis-
tance. As before, if none of these distances exceeds the
threshold distance, the next data point would be evalu-
ated, and so on. However, in Figure 17C, the straight
line distance between the fourth data point and the
straight line connecting the first and fifth data points
exceeds the distance threshold. When any one of the
intermediate data points is more distant from a straight
line connecting the first and fifth points than the thresh-
old distance, a determination is made that the path of
the road is curved enough that the straight line repre-
senting the approximation of the road does not suffi-
ciently describe the actual road shape. Therefore, the
point at which the curvature changed (i.e., in this case,
the third data point) is selected as a "proto-shape"
point, as shown in Figure 17D. The third data point and
the first data point are marked as proto-shape points
and data indicating their selection as proto-shape points
are stored. The second data point in Figure 17D is
marked as discarded, as described in connection with
the previous embodiment.

[0087] After the third data point has been selected
as a proto-shape point, the third data point is used as
the starting end of a new straight line approximation of
the road shape. This is illustrated in Figure 17E. A
straight line is formed connecting the third data point
and the fifth data point, skipping the intermediate data
point, i.e., the fourth data point. As before, the distance
between the intermediate data point (i.e., the fourth
data point) and the straight line approximation between
the third data point and the fifth data point is compared
to the threshold distance. If the distance exceeds the 
threshold distance, the fourth data point is selected as a 
proto-shape point. On the other hand, if the distance 
between the fourth data point and the straight line con- 
necting the third data point and the fifth data point does 
not exceed the threshold, a straight line approximation 
is calculated between the third data point and the sixth 
data point, and so on. The process continues until all 
the smoothed fused data points are evaluated. 
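
One possible reading of the process of Figures 17A-17E is sketched below. The text does not say how the point of curvature reversal is located, so this sketch assumes, following the remark about Figure 17C, that it is the first intermediate point lying on the opposite side of the trial line from the intermediate points before it. All function names are hypothetical, and when no reversal is found the sketch falls back to selecting the point just before the failing end point.

```python
import math


def signed_offset(p, a, b):
    """Signed distance of p from the line a->b (positive = left of a->b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return (dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def reversal_index(points, start, end):
    """First intermediate index on the opposite side of the trial line
    from the earlier intermediate points, or None if there is none."""
    first_sign = 0
    for i in range(start + 1, end):
        offset = signed_offset(points[i], points[start], points[end])
        sign = (offset > 0) - (offset < 0)
        if sign == 0:
            continue
        if first_sign == 0:
            first_sign = sign
        elif sign != first_sign:
            return i
    return None


def select_with_reversals(points, threshold):
    """Proto-shape point indices, preferring points of curvature reversal."""
    if len(points) < 3:
        return list(range(len(points)))
    kept, start, end = [0], 0, 2
    while end < len(points):
        exceeded = any(
            abs(signed_offset(points[i], points[start], points[end])) > threshold
            for i in range(start + 1, end)
        )
        if exceeded:
            flip = reversal_index(points, start, end)
            start = flip if flip is not None else end - 1
            kept.append(start)
            end = start + 2
        else:
            end += 1
    if kept[-1] != len(points) - 1:
        kept.append(len(points) - 1)
    return kept
```

For a short S-shaped series, the point at which the path crosses to the other side of the trial line is kept even though it is not the point that exceeded the threshold.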
[0088] As before, the data not used in the formation 
of the geographic database may be stored in a data 
archive and used at a later time to form different data- 
bases having different levels of accuracy. 
[0089] The process described in connection with 
Figures 17A-17E may be used as a substitute for, in 
addition to, or as a supplement to the process described 
in Figures 16A-16E. 

Another alternative process for selection of shape data to discard

[0090] The process described in connection with 
Figures 16A-16E is one way that the shape point gener-
ation algorithm can determine which smoothed
fused data to discard. An alternative process is
described in connection with Figures 18A-18E. Like the
process described in connection with Figures 16A-16E,
the process described in Figures 18A-18E includes a 
series of evaluations by the shape point generation 
algorithm to determine which smoothed fused data 
points deviate enough from a straight line generated 
from a previous data point so that a shape point should 
be included in the database. 

[0091] Figure 18A shows the same series of fused 
data points 455 that are shown in Figures 16A-16E.
One of the points in Figure 18A is selected as a starting 
point of a straight line approximation of the represented 
road. From the starting point (i.e., the first point), a 
straight line is determined between the first point and 
the third point, skipping the intermediate (i.e., the sec- 
ond) point. The straight line connecting the first and 
third data points represents an approximation of the 
road shape which is evaluated to determine whether it 
satisfies the specified criterion for the accuracy of the 
road. As in the process described in connection with 
Figures 16A-16E, this evaluation includes determining 
the distance between the intermediate point and the 
straight line connecting the first and third points and 
then comparing this distance to a threshold distance. If 
the distance between the intermediate point and the line 
connecting the first and third points is less than the 
threshold distance, the next point in the series is evalu-
ated.

[0092] Figure 18B shows a straight line connecting 
the first and fourth points. This embodiment of the 
shape point generation algorithm performs an evalua- 
tion of the distances between both the intermediate 
points (i.e., the second and third points) and the straight
line connecting the first and fourth points. In Figure 18B,
neither of these distances is greater than the threshold
distance. The shape point generation algorithm pro-
ceeds to examine the next data point in the series.
[0093] In Figure 18C, the shape point generation 
algorithm calculates a straight line between the first 
data point and the fifth data point. As before, the dis-
tances between each of the intermediate data points
(i.e., the second, third, and fourth points) and the
straight line are determined. Each of these distances is
compared to the threshold distance. As before, if none
of these distances exceeds the threshold distance, the
next data point would be evaluated, and so on. How-
ever, in Figure 18C, the straight line distance between
the third data point and the straight line connecting the
first and fifth data points exceeds the distance thresh-
old. When any one of the intermediate data points is
more distant from a straight line connecting the first and
fifth points than the threshold distance, a determination
is made that the path of the road is curved enough that
the straight line representing the approximation of the
road does not sufficiently describe the actual road
shape. Therefore, the point that exceeded the threshold
distance (i.e., in this case, the third data point) is
selected as a "proto-shape" point, as shown in Figure
18D. The third data point and the first data point are
marked as proto-shape points and data indicating their
selection as proto-shape points are stored. The second
data point in Figure 18D is marked as discarded, as
described in connection with the previous embodiment.
(If more than one intermediate point exceeded the
threshold distance, the first of these points would be
chosen as the proto-shape point.)

[0094] After the third data point has been selected
as a proto-shape point, the third data point is used as
the starting end of a new straight line approximation of
the road shape. This is illustrated in Figure 18E. A
straight line is formed connecting the third data point
and the fifth data point, skipping the intermediate data
point, i.e., the fourth data point. As before, the distance
between the intermediate data point (i.e., the fourth
data point) and the straight line approximation between
the third data point and the fifth data point is compared
to the threshold distance. If the distance exceeds the
threshold distance, the fourth data point is selected as a
proto-shape point. On the other hand, if the distance
between the fourth data point and the straight line con-
necting the third data point and the fifth data point does
not exceed the threshold, a straight line approximation
is calculated between the third data point and the sixth
data point, and so on. The process continues until all
the smoothed fused data points are evaluated.
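
The variant of Figures 18A-18E differs from the earlier process only in which point is selected when a trial line fails. A minimal sketch, with hypothetical function names and planar (x, y) points assumed, is:

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def select_first_exceeding(points, threshold):
    """Proto-shape point indices; on failure, the first intermediate
    point that exceeded the threshold is selected."""
    if len(points) < 3:
        return list(range(len(points)))
    kept, start, end = [0], 0, 2
    while end < len(points):
        exceeding = [
            i for i in range(start + 1, end)
            if point_line_distance(points[i], points[start], points[end]) > threshold
        ]
        if exceeding:
            start = exceeding[0]  # first offending point, per paragraph [0093]
            kept.append(start)
            end = start + 2
        else:
            end += 1
    if kept[-1] != len(points) - 1:
        kept.append(len(points) - 1)
    return kept
```

On the same input, this variant tends to keep the point of greatest local deviation rather than the last point that was still acceptable.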
[0095] As stated in connection with Figures 16A-
16E, the data not used in the formation of the geo-
graphic database may be stored in a data archive and
used at a later time to form different databases having 
different levels of accuracy. 
[0096] The processes described in connection with
Figures 16A-16E and Figures 18A-18E may be used in
combination. Both these processes may be run on a
collection of smoothed raw data and the results com-
pared for size, accuracy, smoothness, etc. The process
that provides the best results, based upon specifiable
criteria, may be used.

[0097] The process described in connection with 
Figures 18A-18E may also be used with the process 
described in Figures 17A-17E.
[0098] In alternative embodiments, one of these 
processes may be used for some kinds of databases 
and the other of these processes may be used for other 
kinds of databases. In another alternative, both these 
processes may be used in the same database. For
example, one process may be used for some areas or 
types of roads and the other process may be used for 
other areas or types of roads. 

B. Forming shape point data - automatic centerlining

(1) Without centerlining

[0099] Referring again to Figure 7, after the shape 
point generation program 400 has evaluated all the
fused data points and determined which of the fused 
smoothed data to use as proto-shape points, the proto- 
shape points can be used to form shape point data for 
the geographic database 100. In one embodiment, the 
proto-shape points determined by the shape point gen-
eration program 400 may be used directly as shape
points in the geographic database 100 (Steps 470 and 
472 in Figure 7). To use the selected data points directly 
as shape points, a database updating program 474 
(shown in Figure 19) is used. The database updating
program 474 may be installed on one of the computer 
workstations 121 located at the field office 118 (as 
shown in Figure 1). Alternatively, the database updating 
program 474 may be installed on another computer, 
such as the portable computer 308 used to collect data
while driving along roads. The computer upon which the 
database updating program 474 is installed includes the 
appropriate hardware and software so that it can be 
connected to the network 124 in order to exchange data 
with the main computer 116. The computer upon which
the database updating program 474 is installed has 
access to the fused smoothed shape point data that has 
been selected as proto-shape point data. The database 
updating program 474 may be similar to the program 
described in the copending patent application entitled
"Method and System for Collecting Data for Updating a 
Geographic Database," Ser. No. 09/256,389, filed Feb- 
ruary 24, 1999, the entire disclosure of which is incorpo- 
rated by reference herein. 

[0100] The database updating program 474 pro-
vides for adding, modifying, and deleting records in the
main copy 100 of the geographic database. If the proto-
shape points relate to roads that are not already repre-
sented in the geographic database, the database updat-
ing program 474 provides for creating new data records
that are stored in the main copy 100 of the geographic
database to represent these roads. The proto-shape 
point data are added as shape point data in the new 
records formed to represent these roads in the main 
copy 100 of the database. If the proto-shape points 
relate to roads that are already represented by data 
records in the main copy 100 of the geographic data- 
base, the database updating program 474 provides for 
modifying the existing data records in the main copy 
100. The existing records are modified to add the proto-
shape point data as shape point data. These modifica-
tions are performed on the primary copy of the geo-
graphic database over the network 124.

(2) With centerlining

[0101] In a preferred embodiment, the proto-shape 
points determined by the automatic shape point gener- 
ation program 400 are modified prior to being added as 
shape point data in the main copy 100 of the geographic 
database. According to this embodiment, the proto- 
shape points determined by the automatic shape point 
generation program 400 are modified by adjusting them 
to coincide with the centerline of the represented road 
(Steps 470 and 498 in Figure 7). This process 498 may
be performed by an automatic centerlining program
500. The automatic centerlining program 500 modifies 
the coordinates of the proto-shape points to take into 
account the position of the vehicle as the raw sensor 
data were being collected. As mentioned above, when 
the vehicle is being driven to collect data (Steps 302 
and 312 in Figure 7), it is driven in a consistent, known 
position on the road. As stated above, the vehicle 304 is 
preferably driven in the rightmost lane. Because the 
vehicle was driven in the rightmost lane when the raw 
sensor data were being acquired, the smoothed fused 
data (derived therefrom) represent the position of the 
rightmost lane. However, as mentioned above in con- 
nection with Figures 4-6, when shape point data are 
stored to represent a shape of a road, the shape point 
data correspond to positions along the centerline of the 
represented road (or the centerline of the lanes in one 
direction of a road represented by separate sets of 
shape points for each direction). Thus, when road posi- 
tion data are collected by the vehicle traveling along a 
road, the collected data do not correspond to the way
that the road is represented in the geographic database. 
[0102] To account for this difference, the automatic 
centerlining program 500 modifies the proto-shape 
point data. More specifically, the automatic centerlining 
program 500 calculates new coordinates for
each of the proto-shape points, thereby shifting the
points to take into account the position of the vehicle as
the raw sensor data were being collected. In this proc- 
ess, 

[0103] The component steps of one embodiment of 
the automatic centerlining program are shown in Figure
19. 

[0104] Referring to Figure 19, an initial step per- 
formed by the automatic centerlining program 500 is to 
receive the proto-shape points as input (Step 510). The 
automatic centerlining program 500 can be operated in 
several different modes. In one mode, input parameters 
518 are provided to specify the shift distance. These 
input parameters 518 may be provided in several
different ways. 

[0105] One way to provide these input parameters 
518 is to have the researcher specify the number of 
lanes and the lane width. The automatic centerlining
program may include a menu for this purpose. This 
approach may be selected when all the lanes are known 
to have the same width or when all the lanes are estimated to
have the same width. Based upon specification of the 
number of lanes and the lane width, a shift distance is 
determined. The shift distance is equal to the width of 
each lane times the number of lanes divided by two, 
minus one half the lane width. For example, for a four 
lane road in which each lane is 8 feet in width, the shift 
distance would be 12 feet. If the road has a median, half 
the width of the median would be added to the shifted 
distance. Note that for roads that are represented by 
separate sets of shape points (such as the road 211 
illustrated in Figure 6), the number of lanes would 
include only those lanes located on one side of the 
median. 

[0106] A second way to provide these input param- 
eters 518 is to have the researcher specify a lane width 
for each lane. This approach may be selected when the 
researcher knows the width of each lane. Based upon 
the specification of the lane width for each lane, a shift 
distance is determined. The shift distance would be 
equal to the sum of all the lane widths divided by two,
minus one half the width of the rightmost lane. If the
road has a median, half the width of the median would be
added to the shift distance. 
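
The two shift distance rules described above can be written out as a short sketch. The function names are hypothetical, all widths are assumed to be in the same unit, and the per-lane list is assumed to be ordered so that its last entry is the rightmost lane (the lane in which the vehicle was driven).

```python
def shift_distance_uniform(lane_count, lane_width, median_width=0.0):
    """Shift distance when every lane has the same width."""
    return lane_count * lane_width / 2.0 - lane_width / 2.0 + median_width / 2.0


def shift_distance_per_lane(lane_widths, median_width=0.0):
    """Shift distance when a width is given for each lane.

    Assumption: lane_widths[-1] is the rightmost lane.
    """
    return sum(lane_widths) / 2.0 - lane_widths[-1] / 2.0 + median_width / 2.0


# The worked example from the text: four lanes, each 8 feet wide.
print(shift_distance_uniform(4, 8.0))  # 12.0
```

When all lanes have the same width the two functions agree, so the first can be regarded as a special case of the second.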

[0107] Still another way to provide input parameters
518 is to specify a shift distance. If a shift distance is
provided as an input, this distance is used instead of 
calculating a shift distance, as described above. 
[0108] According to another mode of operation of 
the automatic centerlining program 500, the shift dis- 
tance may be determined automatically. In order to 
operate in this mode, information about the road, such 
as the number of lanes and width of each lane, needs to
have already been stored as attributes of the road.
Using the values of these attributes, a centerline shift 
can be calculated using the formula described above. 
[0109] Using the input parameters 518, new coordi- 
nates are determined for each of the proto-shape points 
(Step 530). The coordinates of the new point deter-
mined for each proto-shape point are calculated so that
the new point coincides with the centerline of the repre- 
sented road. When determining the coordinates of the 
new point, the curvature of the road is taken into
account. According to one embodiment, a tangent of the
curvature is approximated at each proto-shape point.
The tangent at a proto-shape point may be approxi-
mated by determining a straight line between the proto-
shape point and that proto-shape point located immedi-
ately before the proto-shape point. After the tangent is
approximated, a line normal to the tangent is deter-
mined through the proto-shape point. The new data
point is located along this normal line. Specifically, the
new data point is located at the centerline shift distance
from the proto-shape point along the normal line. The
direction in which the new data point is located along
the normal line from the proto-shape point is deter-
mined taking into account the direction of travel of the
vehicle (which can be determined by the order in which
the raw sensor points were acquired). Using the direc-
tion of vehicle travel, the new data point is located in the
left direction (relative to the vehicle direction of travel)
along the normal line (for countries in which traffic trav-
els on the right sides of roads). These steps are illus-
trated in Figures 20A-20D.

[0110] In Figure 20A, a series of proto-shape points
602 is shown. Also shown is an outline of the road along
which the raw sensor data were acquired and from
which the proto-shape points were derived. The auto-
matic centerlining program 500 evaluates each proto-
shape point one at a time and determines a new data
point. For example, starting with the proto-shape point
labeled 604, a tangent of the curvature at the point is
approximated, as shown in Figure 20B. The tangent is
approximated by determining a straight line between
the proto-shape point 604 and the proto-shape point
located immediately prior thereto. This prior proto-
shape point is labeled 606. Then, a line normal to the
tangent at the proto-shape point 604 is determined, as
shown in Figure 20C. Using the shift distance (which is
either input or derived from the attributes of the road),
the coordinates of a new data point are determined at
the shift distance from the coordinates of the proto-
shape point along the normal line, as shown in Figure
20D. The coordinates of the new data point are stored.
The automatic centerlining program then evaluates the
next proto-shape point in order to determine a new data
point and so on until new data points are determined for
each of the proto-shape points.
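
The tangent-and-normal shift of Figures 20A-20D can be sketched as follows. This is only an illustration under stated assumptions: the function name `shift_to_centerline` is hypothetical, coordinates are assumed to be planar (x east, y north) rather than latitude and longitude, the points are assumed to be in the order of vehicle travel with no two consecutive points coinciding, and for the very first point, which has no prior proto-shape point, the line to the next point is used instead.

```python
import math


def shift_to_centerline(proto_points, shift):
    """Shift each proto-shape point to the left of the travel direction.

    proto_points : list of at least two planar (x, y) tuples in travel order
    shift        : centerline shift distance, in the same units
    """
    shifted = []
    for i, (x, y) in enumerate(proto_points):
        if i > 0:
            # Tangent approximated by the line from the prior point.
            px, py = proto_points[i - 1]
            tx, ty = x - px, y - py
        else:
            # Assumption: the first point uses the line to the next point.
            qx, qy = proto_points[1]
            tx, ty = qx - x, qy - y
        length = math.hypot(tx, ty)
        # The left-hand normal of the tangent (tx, ty) is (-ty, tx).
        shifted.append((x - ty / length * shift, y + tx / length * shift))
    return shifted
```

For countries in which traffic travels on the left, the sign of the normal would be reversed.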

[0111] Referring again to Figure 19, after all the
new data points are determined by the automatic
centerlining program 500, these new data points are
provided to the database updating program 474. The
new data points are stored as shape points in the pri-
mary copy 100 of the geographic database by the data-
base updating program 474 in the manner described
above.

[0112] An alternative process for the centerline
shifting step 530 of the automatic centerlining program
is shown in Figures 21A-21D. Like the steps described
in connection with Figures 20A-20D, the steps shown in
Figures 21A-21D determine new data points for each of
the proto-shape points. According to the embodiment
shown in Figures 21A-21D, a curvature of the road is
determined at each proto-shape point. The curvature at
each proto-shape point may be determined in several
different ways. One way to determine the curvature at
the location of a proto-shape point is to calculate the
curvature taking into account the one or more proto-
shape points that are located before and after the proto-
shape point. Alternatively, the automatic shape point
generation program may use the value of the curvature
data acquired by the sensors (e.g., the inertial sensors
308 in Figure 8) which is associated with the proto-
shape point data for that point if available. A radial line is
determined through the center of the curve correspond-
ing to the curvature and the proto-shape point. The new
data point is located along this radial line. Specifically,
the new data point is located at the centerline shift dis-
tance from the proto-shape point along the radial line.
As in the previously described embodiment, the direc-
tion in which the new data point is located along the
radial line from the proto-shape point is determined tak-
ing into account the direction of travel of the vehicle.
Using the direction of vehicle travel, the new data point
is located in the left direction (relative to the vehicle
direction of travel) along the radial line (for countries in
which traffic travels on the right sides of roads). These
steps are illustrated in Figures 21A-21D.
[0113] In Figure 21A, the series of proto-shape
points 602 from Figure 20A is shown. As in the previous
embodiment, the automatic centerlining program 500
evaluates each proto-shape point one at a time and
determines a new data point. For example, starting with
the proto-shape point labeled 604, a curvature at the
point is determined, as explained above. The curvature
is illustrated in Figure 21B. Then, a radial line through
this curve is determined, as shown in Figure 21C. Using
the shift distance (which is either input or derived from
the attributes of the road), a new data point is deter-
mined at the shift distance from the proto-shape point
along the radial line, as shown in Figure 21D. The coor-
dinates of the new data point are stored. The automatic
centerlining program then evaluates the next proto-
shape point in order to determine a new data point and
so on until new data points are determined for each of
the proto-shape points.
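
The radial-line shift of Figures 21A-21D can be sketched as follows. The text leaves open how the curvature is computed, so this sketch assumes that the curve at a proto-shape point is the circle through that point and its two neighbors. The function names are hypothetical, coordinates are assumed to be planar, and when the three points are collinear the sketch falls back to the left-hand normal of the travel direction.

```python
import math


def circumcenter(a, b, c):
    """Center of the circle through a, b and c, or None if collinear."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2 = a[0] ** 2 + a[1] ** 2
    b2 = b[0] ** 2 + b[1] ** 2
    c2 = c[0] ** 2 + c[1] ** 2
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return (ux, uy)


def shift_along_radial(prev, cur, nxt, shift):
    """Shift cur along the radial line, to the left of the travel direction."""
    tx, ty = nxt[0] - prev[0], nxt[1] - prev[1]  # travel direction estimate
    center = circumcenter(prev, cur, nxt)
    if center is None:
        rx, ry = -ty, tx  # straight road: use the left-hand normal
    else:
        rx, ry = cur[0] - center[0], cur[1] - center[1]
        if tx * ry - ty * rx < 0:
            rx, ry = -rx, -ry  # the radial pointed right of travel, so flip it
    length = math.hypot(rx, ry)
    return (cur[0] + rx / length * shift, cur[1] + ry / length * shift)
```

On a left-hand curve the new point moves toward the center of the curve, and on a right-hand curve it moves away from it.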

V. ALTERNATIVE EMBODIMENTS 
A. Centerline first embodiment 

[0114] In the first embodiment described above,
proto-shape points are determined from the fused raw 
data points and then, after the proto-shape points are 
determined, the shape points are determined by calcu-
lating new data points at locations that are shifted to the
centerline from the positions of the proto-shape points. 
In an alternative embodiment, these steps can be 
reversed. According to this alternative embodiment, the 
fused raw points are shifted to the centerline first. Then, 
these data points, which are now located along the cen- 
terline of the represented road, are evaluated using the 
shape point generation program 400 in order to deter- 
mine which of these points to discard and which of 
these points to use as shape points in the primary copy 
of the geographic database. 

B. Acquire data from both sides of road

[0115] In the first embodiment described above, 
raw sensor points are acquired driving along only one 
side of the road. Driving along only one side of the road 
to acquire road position data may be preferable 
because it is more efficient. However, in an alternative 
embodiment, vehicle position data can be acquired driv- 
ing in both directions along the road. If vehicle position 
data are acquired driving in both directions along the 
road, a centerline can be determined by calculating a 
line halfway between the vehicle paths in each direction. 
If vehicle position data are acquired driving in both 
directions, the fused raw data for each direction may be 
shifted to the centerline first. These fused raw data may 
be very dense because they represent data acquired in 
both directions. These fused raw sensor data that are 
shifted to the centerline may be evaluated using the 
shape point generation program 400 to determine which 
of the data to discard and which of the data to use as 
shape point data. 
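
One way to picture the "line halfway between the vehicle paths" is the sketch below. It assumes planar coordinates in meters and pairs each point of one path with the nearest point on the other path; the pairing strategy and all function names are hypothetical choices made for illustration, not the method of the disclosure. Each path must contain at least two points.

```python
import math


def nearest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a
    # Projection parameter, clamped so the result stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def nearest_point_on_path(p, path):
    """Closest point to p on a polyline given as a list of (x, y) points."""
    best, best_dist = None, math.inf
    for a, b in zip(path, path[1:]):
        q = nearest_point_on_segment(p, a, b)
        d = math.dist(p, q)
        if d < best_dist:
            best, best_dist = q, d
    return best


def centerline_between(path_one_way, path_other_way):
    """Midpoints between each point of one path and the opposing path."""
    centerline = []
    for p in path_one_way:
        q = nearest_point_on_path(p, path_other_way)
        centerline.append(((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0))
    return centerline
```

The resulting midpoints could then be passed to a point-selection routine, in the same way that the shifted fused data are evaluated in the paragraph above.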

C. Producing derived database products of lower accuracy
levels

[0116] In the first embodiment described above, the
automatic shape point generation program may be used 
to generate shape points for a database of any specified 
accuracy. For example, a high level of accuracy may be
specified, such as 1 meter. After shape point data for 
the primary copy of the geographic database are stored 
with this accuracy level, a derived database product 
(such as one of the products 110 in Figure 2) can be 
produced. This derived database product can be used 
in an application that requires the high level of accuracy, 
such as obstacle warning and avoidance, curve warn- 
ing, advanced cruise control, headlight aiming, and so 
on. If a derived database product having a lower level of 
accuracy (e.g., 10 meters) would suffice for a different 
kind of application (such as Internet map displays), a 
derived database product having a lower level of accu- 
racy can be formed from a primary copy having a higher 
level of accuracy. In order to form a derived database 
product of a lower accuracy from a primary copy that 
has a high level of accuracy, the automatic shape point 
generation program 500 can be run using the shape
point data in the primary copy as input. When the automatic
shape point generation program is run with the
primary copy as input, a level of accuracy is specified 
which is lower than the level of accuracy of the primary 
copy. For example, if the primary copy was formed with
a level of accuracy of 1 meter, a lower level of accuracy
would be specified to the automatic shape point generation
program when it is run using the shape point data in the
primary copy as input. When run in this manner, the
automatic shape point generation program treats the
actual shape point data as if they were fused raw sensor
data. The automatic shape point generation program
would discard those shape points that are not needed to
provide the lower level of accuracy. According to this
embodiment, it would be preferable to form the primary 
copy with the highest level of accuracy that would be 
expected to be needed, and then specify lower levels of 
accuracy for each derived database product. 
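
The re-run described above can be sketched as follows. This is a simple greedy selection written for illustration, assuming planar coordinates in meters; it is not the automatic shape point generation program itself, and the names `distance_to_line` and `select_shape_points` are hypothetical. A point is discarded only if it lies within the tolerance of the straight line joining the kept points on either side of it.

```python
import math


def distance_to_line(p, a, b):
    """Perpendicular distance from p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def select_shape_points(points, tolerance):
    """Keep the points needed to stay within `tolerance` of the input path."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    anchor = 0
    for i in range(2, len(points)):
        # Would a line from the anchor to point i stray too far from
        # any of the points it skips over?
        too_far = any(
            distance_to_line(points[k], points[anchor], points[i]) > tolerance
            for k in range(anchor + 1, i)
        )
        if too_far:
            anchor = i - 1
            kept.append(points[anchor])
    kept.append(points[-1])
    return kept


# Hypothetical fused data along a gentle curve, spaced 5 meters apart in x.
fused = [(float(x), 0.002 * x * x) for x in range(0, 201, 5)]
primary = select_shape_points(fused, tolerance=1.0)     # primary copy
derived = select_shape_points(primary, tolerance=10.0)  # derived product
```

Because the second call treats the primary copy's shape points as its input, every point in `derived` is also a point in `primary`, which mirrors the behavior described in this embodiment.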

D. Alternative method of production of derived database
products of lower accuracy levels

[0117] An alternative to the foregoing embodiment 
would be to form multiple versions of the primary copy 
of the geographic database from the fused raw data. 
According to this alternative, instead of applying the 
automatic shape point generation program to the data in 
the primary copy in order to form derived database 
products of lower levels of accuracy, separate versions 
of primary copies would be prepared from the raw fused 
sensor data for different applications. This alternative 
has the advantage that the raw fused sensor data are 
used to form primary copies of databases of a desired 
level of accuracy for each specific application. 



E. Collection by end users

[0118] In the embodiments described above, it was
stated that road position data were collected by a
researcher driving a vehicle in which a positioning system
was installed along roads in a geographic area. In
an alternative embodiment, the road position data may
be collected by someone other than a researcher. For
example, a positioning system that collects road position
data may be installed in a vehicle which is used and
driven by an end user. The end user may be a commercial
user (e.g., a taxi cab driver or a delivery truck driver)
or alternatively, the end user may be a non-commercial
user. According to this alternative, a data storage system
operates to collect the vehicle position data as the
vehicle is being driven along roads, in the same manner
as in the embodiment described above in which the
researcher was driving the vehicle. The data storage
system may be located in the vehicle or may be located
remotely. If the data storage system is located remotely,
the vehicle position data are transmitted by a wireless
communication system to the remote location at which
the data storage system is located. In this embodiment,
once the vehicle position data are acquired, they are
processed into shape point data in the same manner as
described above. When vehicle position data are
acquired by an end user, a verification process may be
used to check the validity of the data (for example, to
check whether the end user's vehicle may have left the
road). When road position data are acquired by end
users, a statistical analysis process may be used to
refine the data. A method for acquiring road position
data using end users' vehicles is described in copending
application Ser. No. 08/951,767, the entire disclosure
of which is incorporated herein by reference.



[0119] The embodiments of the data collection systems
described above can be used to collect data relating
to the positions of roads in a geographic area.
According to one embodiment, while a researcher is
driving a vehicle along a road to collect data relating to
the positions of roads, the researcher may also be obtaining
data relating to other road attributes. These other
road attributes include signage (e.g., signs along the
road), speed limits, addresses and address ranges,
street names, number of lanes, turn signals, lane dividers,
road surface composition, stop lights, stop signs,
etc. Embodiments of systems and methods for collecting
some of these kinds of road attributes are
described in copending applications, Ser. Nos.
09/256,389 and 09/335,122, the entire disclosures of
which are incorporated by reference herein.

G. Various other alternatives

[0120] In one embodiment, the data acquisition program
310, the automatic shape point generation program,
and the automatic centerlining program are
written in the C programming language. In alternative
embodiments, other programming languages may be
used, such as C++, Java, Visual Basic, and so on.
[0121] In the first embodiment described above, the
raw sensor data were fused and smoothed using a
least-squares fit to a cubic equation. In alternative
embodiments, other types of smoothing and filtering
techniques may be used. In yet another alternative, no
filtering of the raw sensor data may be performed.
[0122] In the first embodiment, both automatic
shape point generation and automatic centerlining were
performed to produce shape point data from the fused
sensor data. In alternative embodiments, the automatic
shape point generation program can be used without the
automatic centerlining program. For example, the output
of the automatic shape point generation program
can be stored as shape point data in the primary copy of
the database without shifting the data points to the
centerline. Alternatively, the data points determined by the
automatic shape point generation program can be
shifted to centerline positions by a means other than the
automatic centerlining program.
[0123] Likewise, the automatic centerlining program
can be used without the automatic shape point
generation program. For example, the automatic centerlining
program can be used on the raw sensor data points
without having the automatic shape point generation
program process the raw sensor data points first. These
raw sensor points, now aligned on the centerline of the
road segment, can be stored as shape points. Alternatively,
the automatic centerlining program can be used
on shape point data that had not been processed by the
automatic shape point generation program.
[0124] In the first embodiment described above, a 
method for automatic generation of shape points was 
disclosed that identifies those raw data points neces- 
sary to include as shape points in a geographic data- 
base in order to provide a desired level of accuracy. 
Various other methods can be used to evaluate these 
raw data points and the automatic shape point genera- 
tion program can employ other algorithms or techniques 
for this purpose. 

[0125] In a present embodiment, the shapes of 
other-than-straight road segments are represented by 
shape point data that include data indicating the geo- 
graphic coordinates of one or more points along the 
road segment located between its endpoints, and which 
additionally may include other data, such as data indi- 
cating the curvature of the represented road segment at 
the locations of the points, data indicating the road 
grade at the location of the points, etc. In addition to 
these kinds of data, there are other ways to represent 
other-than-straight road segments. Some of these other 
ways to represent other-than-straight roads include 
splines (including Bezier curves), clothoids, etc. One 
way to implement a representation of an other-than- 
straight road segment is disclosed in Ser. No. 
08/979,211, filed November 26, 1997, the entire disclo- 
sure of which is incorporated by reference herein. 
Embodiments of the automatic shape point generation 
program, disclosed above, can be used with any of 
these other kinds of representations. Likewise, embodi- 
ments of the automatic centerlining program, disclosed 
above, can be used with any of these other kinds of rep- 
resentations. 

[0126] In the shape point evaluation algorithm 
described above, it was stated that the point selected as 
the starting point for evaluation by the shape point gen- 
eration algorithm was the point that coincides with the 
node at an endpoint of the road segment. In alternative 
embodiments, the shape point evaluation algorithm can 
start at any point including any point located between 
the end points of a segment. 

[0127] In some of the embodiments of the shape
point evaluation algorithm described above, it was
stated that intermediate fused data points were evaluated
by determining the distance of each intermediate
fused data point to a straight line connecting the fused
data points on either side of the intermediate points and
then comparing these distances to a threshold distance.
In an alternative embodiment, instead of comparing the
distance of each intermediate point to a threshold distance,
the curvature at each successive point can be
compared to a percentage threshold, e.g., ±10%, of the
curvature of a prior point. If the curvature at a successive
point is outside the percentage threshold of the curvature
of the prior point, the curvature of the prior point
no longer sufficiently describes the shape of the path
and a proto-shape point is selected, as described in
connection with the prior embodiments. With this alternative
embodiment, the proto-shape points are selected
so that the difference in curvature between any two
adjacent proto-shape points does not exceed the
selected percentage threshold.
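
The curvature-based alternative can be sketched as below. This is only an illustration under stated assumptions: the curvature at each fused data point is already known, the "prior point" is taken to be the most recently selected proto-shape point, and the point chosen when the threshold is exceeded is the last point still inside the band. The function names are hypothetical.

```python
def outside_band(value, reference, pct):
    """True if value differs from reference by more than pct of reference."""
    return abs(value - reference) > pct * abs(reference)


def select_proto_shape_indices(curvatures, pct=0.10):
    """Return indices of points selected as proto-shape points.

    The first and last points are always kept (an assumption of this sketch).
    """
    n = len(curvatures)
    if n == 0:
        return []
    selected = [0]
    for i in range(1, n):
        if outside_band(curvatures[i], curvatures[selected[-1]], pct):
            # Prefer the last point whose curvature was still inside the band.
            if selected[-1] != i - 1:
                selected.append(i - 1)
            # If the step from that point to point i alone leaves the band,
            # the sampling allows nothing better than keeping point i too.
            if outside_band(curvatures[i], curvatures[selected[-1]], pct):
                selected.append(i)
    if selected[-1] != n - 1:
        selected.append(n - 1)
    return selected
```

Note that a reference curvature of zero (a straight stretch) makes the band zero wide in this sketch, so any nonzero curvature that follows triggers a selection; a practical implementation would likely need a minimum absolute tolerance as well.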

[0128] As mentioned above, a camera may be
located in the vehicle and operated to collect images
around the vehicle as the vehicle is being driven along
the roads to collect position and curvature data. The
images from the camera may be used for various purposes.
For example, the images from the camera may
be used in conjunction with the automatic centerlining
program to determine the centerline shift distance. In
another example, images from the camera may be used
to correct the shape point location to account for deviations
by the vehicle from the center of the rightmost
lane, e.g., to avoid an obstacle or to make a turn.

VI. ADVANTAGES 


[0129] The present system and method provide for
collecting data for a geographic database efficiently and
quickly. The disclosed systems and methods provide for
the consistent and accurate determination of road position
data for a geographic database. The disclosed systems
and methods take advantage of the high levels of
accuracy that can be provided by the sensor equipment
and ensure that the high levels of accuracy are maintained
in the geographic data derived from this sensor
equipment.

[0130] An advantage associated with the disclosed
embodiments is that the high accuracy that can be
obtained at the sensor level is maintained in the database
product formed therefrom while relying on software
programs that automatically adjust road
geometries and automatically generate shape and curvature
data.

[0131] The present system and method provide for
the production of various database products each with a
level of accuracy tailored to the application for which the
specific product will be used. As mentioned above in
connection with Figure 2, various different kinds of database
products 110 may be produced using the master
copy 100 of the geographic database. Each of these different
database products 110 may include shape point
data (e.g., shape point data 222(3) described in connection
with Figure 5), in order to represent the shapes of
other-than-straight roads. However, the number of
shape points needed by each of these different database
products to represent other-than-straight roads
may be different. These differing needs result from the 
different purposes for which each of these different 
database products is used. Some database products 
are used in applications that require greater accuracy, 
and therefore, such database products may require a 
greater number of shape points to more accurately rep- 
resent other-than-straight roads. On the other hand, 
other database products are used in applications that 
require less accuracy, and therefore these database 
products may require a lesser number of shape points 
to represent other-than-straight roads. For an applica- 
tion that requires lesser accuracy, there may be advan- 
tages to having fewer shape point data in the database 
product used by the application. If fewer shape point 
data are included, the storage capacity requirements in 
the database are decreased. In addition, if fewer shape 
point data are included, an application using the data- 
base may run faster. 

[0132] The foregoing advantages relate to data- 
base products (110 in Figure 2) that are derived from 
the master copy 100 of the geographic database. With 
respect to the master copy of the geographic database 
100, similar considerations apply. Although all the 
smoothed fused data could be stored as shape point 
data in the master copy 100 of the geographic data- 
base, this would cause the size of the database to be 
very large. The large size of such a database may result 
in difficulties in handling, managing, updating, and 
maintenance. Accordingly, it is an advantage that only a 
portion of the smoothed fused data are stored as shape 
point data in the master copy of the geographic data- 
base. 

[0133] It is intended that the foregoing detailed 
description be regarded as illustrative rather than limit- 
ing and that it is understood that the following claims 
including all equivalents are intended to define the 
scope of the invention. 

Claims 

1. A method of storing data in a geographic database 
to represent roads, the method comprising the 
steps of : 

from a collection of source data that represent 
points along said roads, determining new posi- 
tions for those points represented by said 
source data, wherein said new positions have 
coordinates that are adjusted to align with centerlines
of the roads represented thereby; and
storing data that represent said new positions 
in a geographic database. 

2. The method of Claim 1 wherein said collection of 
source data is acquired while driving a vehicle 
along said roads. 



3. The method of Claim 1 wherein said collection of 
source data comprises raw sensor data. 

4. The method of Claim 1 wherein said collection of
source data comprises raw sensor data formed as
a result of a fusing step in which each raw sensor
reading is modified to take into account sensor
readings from a plurality of different types of sensors.

5. The method of Claim 1 wherein said source data 
are acquired using an inertial sensor system and a 
GPS system. 

6. The method of Claim 1 further comprising the step
of:

determining which of said points are necessary
to represent the roads with a desired level of
accuracy,

and wherein the data stored in the geographic
database excludes those points that were not
determined to be necessary to represent the
roads with the desired level of accuracy.

7. The method of Claim 6 wherein said step of determining
which of said points are necessary to represent
the roads with a desired level of accuracy is
performed prior to the step of determining new
positions.

8. The method of Claim 7 wherein said step of determining
which of said points are necessary to represent
the roads with a desired level of accuracy is
based upon an evaluation of said points such that a
straight line connecting any two adjacent necessary
points is not farther from any unnecessary points
located between said two adjacent necessary
points than a distance associated with said level of
accuracy.

9. The method of Claim 7 wherein said step of determining
which of said points are necessary to represent
the roads with a desired level of accuracy is
based upon an evaluation of said points such that
each point not determined to be necessary is
located less than a distance associated with said
level of accuracy away from a straight line that connects
the closest necessary points on either side of
said point determined not to be necessary.

10. The method of Claim 6 wherein said level of accu- 
racy is specified to be less than approximately 1 
meter. 

11. The method of Claim 6 wherein said level of accuracy
is specified to be a value between approximately
3 and 5 meters.

12. The method of Claim 6 wherein said level of accuracy
is specified to be a value between approximately
1 and 3 meters.

13. The method of Claim 1 wherein at least some of
said centerlines correspond to a line extending
along a center of all the lanes that form a represented
road.

14. The method of Claim 1 wherein at least some of
said centerlines correspond to a line extending
along a center of all the lanes of a represented road
that have an identical legal direction of travel.

15. The method of Claim 1 wherein, for those roads
whose shape is represented by a single grouping of
points, said centerlines correspond to lines extending
along centers of all the lanes that form the
respective represented roads and, for those roads
represented by separate groupings of points for
each direction of travel, said centerlines correspond
to lines extending along centers of each of said
grouping of lanes that form the respective represented
roads.

16. The method of Claim 1 wherein each new position,
for those roads each of whose shape is represented
by a single grouping of points, corresponds
to a displacement equal to a lane width times a total
number of lanes divided by two plus one-half the
lane width from the respective point along the road
from which the new position was derived.



17. The method of Claim 1 wherein each new position,
for those roads each of whose shape is represented
by a single grouping of points, corresponds
to a displacement equal to a sum of the widths of all
the lanes divided by two plus one-half the lane
width of the rightmost lane from the respective point
along the road from which the new position was
derived.

18. A method of forming a geographic database comprising:

driving in a rightmost lane of roads with a vehicle
equipped with a positioning system that
acquires data representative of positions of
said vehicle over time as said vehicle travels
along said roads;
using an automatic centerlining program to
determine, for at least some of said positions
represented by said acquired data, new positions
along centerlines of said roads, wherein
said new positions are determined by a displacement
relative to said positions represented
by said acquired data; and
storing data that represent the new positions in
a geographic database.

19. The method of Claim 18 further comprising the step
of:

after the step of driving in a rightmost lane,
determining which of said positions represented
by said acquired data are necessary to
represent the roads with a desired level of
accuracy.

20. A method of forming a geographic database comprising:

providing as an input to an automatic centerlining
program a collection of source data that
represent points along said roads acquired by
a data collection system located in a vehicle
that traveled along said roads;
with said automatic centerlining program,
determining new positions, wherein said new
positions are displaced relative to said positions
represented by said acquired data by a
distance corresponding to a distance between
a lane in which said vehicle traveled and a centerline
of the corresponding road; and
storing data that represent the new positions in
a geographic database.

21. The geographic database stored on a computer
readable medium formed using the process of
Claim 1, 18, or 20.